
Intrinsic Stability Limits of Autoregressive Reasoning: Structural Consequences for Long-Horizon Execution

Hsien-Jyh Liao
arXiv (2026)
Reasoning RL Benchmark

Paper Summary

Theoretical Analysis of LLM Long-Horizon Reasoning
Autoregressive reasoning has an intrinsic stability limit where decision advantage decays exponentially with length, necessitating discrete segmentation into graph-like structures rather than continuous linear chains.
Core Problem
Long-horizon reasoning in LLMs frequently collapses not due to task complexity, but because the autoregressive process itself accumulates internal uncertainty that erodes directional alignment over time.
Why it matters:
  • Existing explanations attribute failure to search complexity or credit assignment, and so miss the process-level instability inherent to autoregressive generation.
  • Scaling laws capture aggregate performance but fail to predict structural breakdown in extended reasoning trajectories.
  • Without understanding this limit, purely linear Chain-of-Thought approaches will inevitably fail on sufficiently long tasks regardless of model size.
Concrete Example: In a linear, unbranched task (like a long chain of logical deductions without ambiguity), a model eventually 'hallucinates' or drifts from the objective simply because the noise from each step accumulates, driving the decision advantage to zero.
Key Novelty
Intrinsic Process-Level Instability Theorem
  • Proposes that reasoning failure is a dynamical system stability problem, not just a search problem.
  • Derives 'Theorem A': a mathematical bound showing that decision advantage decays exponentially with reasoning length due to contraction-like dynamics of noisy updates.
  • Identifies a 'critical length' L* beyond which single-path execution becomes statistically indistinguishable from noise, necessitating a switch to graph-based (DAG) structures.
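The summary does not reproduce the exact statement of Theorem A, so the following is only a minimal sketch under an assumed functional form: decision advantage A(L) = a0 * exp(-decay_rate * L), with the critical length L* defined as the point where A(L) falls to a reliability threshold. The names `a0`, `decay_rate`, and `threshold`, and their default values, are illustrative assumptions and do not come from the paper.

```python
import math


def decision_advantage(length, a0=1.0, decay_rate=0.05):
    """Assumed exponential decay law: A(L) = a0 * exp(-decay_rate * L).

    a0 and decay_rate are hypothetical parameters, not values from the paper.
    """
    return a0 * math.exp(-decay_rate * length)


def critical_length(a0=1.0, decay_rate=0.05, threshold=0.1):
    """Length L* at which the assumed A(L) equals the reliability threshold.

    Solving a0 * exp(-decay_rate * L) = threshold gives
    L* = ln(a0 / threshold) / decay_rate.
    """
    if decay_rate <= 0:
        raise ValueError("decay_rate must be positive")
    if not 0 < threshold < a0:
        raise ValueError("threshold must lie strictly between 0 and a0")
    return math.log(a0 / threshold) / decay_rate


l_star = critical_length()
print(f"L* = {l_star:.2f} steps")
print(f"A(L*) = {decision_advantage(l_star):.3f}")
```

Under this assumed form, L* grows only logarithmically in the initial advantage `a0` but inversely in `decay_rate`, which is one way to read the claim that a larger starting advantage alone cannot extend a single path indefinitely.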
Evaluation Highlights
  • Theoretical derivation of a critical reasoning length L* where decision advantage drops below a reliability threshold.
  • Establishment of an exponential decay law for decision advantage in single-path autoregressive reasoning.
  • Conceptual mapping of stable reasoning to Directed Acyclic Graphs (DAGs) where edge lengths must remain below L*.
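The constraint that edge lengths stay below L* can be illustrated with a small sketch. It splits a linear chain of reasoning steps into consecutive segments, each strictly shorter than a given critical length, and returns them as edges between checkpoint nodes. This produces only a path graph, the simplest possible DAG; the function name `segment_chain`, the checkpoint representation, and the example value of `l_star` are all hypothetical and are not taken from the paper.

```python
import math


def segment_chain(num_steps, l_star):
    """Split steps 0..num_steps into edges whose length is strictly below l_star.

    Each edge is a (start, end) pair of checkpoint indices; its length is
    end - start. This is an illustrative segmentation rule, not the paper's.
    """
    # Largest integer length that is strictly less than l_star.
    max_len = math.ceil(l_star) - 1
    if max_len < 1:
        raise ValueError("l_star must exceed 1 to fit at least one step per edge")

    edges = []
    start = 0
    while start < num_steps:
        end = min(start + max_len, num_steps)
        edges.append((start, end))
        start = end
    return edges


# Hypothetical example: a 100-step chain with an assumed L* of 46.05 steps.
edges = segment_chain(100, 46.05)
print(edges)  # [(0, 46), (46, 92), (92, 100)]
```

Each checkpoint between edges stands in for a structural reset, such as verifying or re-grounding an intermediate result before continuing; how that reset is performed is left open here.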
Breakthrough Assessment
9/10
Provides a fundamental theoretical limit (similar to the bandwidth theorem in signal processing) for autoregressive reasoning, challenging the assumption that CoT can scale indefinitely without structural resets.