decision advantage: A metric (rho) measuring how much more closely a model's current state aligns with the correct target proposition than with its negation (the incorrect path).
autoregressive reasoning: The process where an LLM generates reasoning steps sequentially, with each step conditioning on all previous steps.
L* (Critical Length): The theoretical maximum length a linear reasoning chain can reach before accumulated noise pushes the decision advantage below a reliability threshold.
contraction coefficient: A value (eta < 1) representing the rate at which each application of the stochastic transition kernel reduces the distinguishability between distributions, so that uncertainty accumulates over successive steps.
DAG: Directed Acyclic Graph, a structured arrangement of reasoning steps where multiple paths or consolidated nodes replace a single linear chain to maintain stability.
structural governance: The mechanism of organizing reasoning into stable segments (nodes and edges) rather than letting it run as a continuous unstructured stream.
total variation distance: A statistical distance measure used here to quantify the distinguishability between the distributions of states in correct and incorrect reasoning trajectories.
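
To make the relationship between these terms concrete, here is a minimal Python sketch. It assumes the decision advantage decays geometrically, rho_L = rho_0 * eta^L, with one factor of eta per reasoning step. That decay model is an illustrative assumption, not a formula given in the definitions above. The function names (`tv_distance`, `advantage_after`, `critical_length`) and the `threshold` parameter are hypothetical.

```python
import math


def tv_distance(p, q):
    """Total variation distance between two discrete distributions
    defined over the same finite set of states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def advantage_after(rho0, eta, steps):
    """Decision advantage after `steps` steps under the ASSUMED
    geometric decay model rho_L = rho0 * eta**L."""
    return rho0 * eta ** steps


def critical_length(rho0, eta, threshold):
    """Largest chain length L with rho0 * eta**L >= threshold
    (an illustrative stand-in for L*, valid only under the assumed model)."""
    if not (0 < eta < 1) or rho0 <= 0 or threshold <= 0:
        raise ValueError("need 0 < eta < 1, rho0 > 0, threshold > 0")
    if rho0 < threshold:
        return 0
    # Closed-form estimate from rho0 * eta**L = threshold.
    length = math.floor(math.log(threshold / rho0) / math.log(eta))
    # Correct for floating-point rounding at the boundary.
    while advantage_after(rho0, eta, length + 1) >= threshold:
        length += 1
    while length > 0 and advantage_after(rho0, eta, length) < threshold:
        length -= 1
    return length


if __name__ == "__main__":
    # Two hypothetical state distributions: correct vs. incorrect trajectory.
    print(tv_distance([0.9, 0.1], [0.1, 0.9]))
    print(critical_length(rho0=0.8, eta=0.9, threshold=0.1))
```

Under this assumed model, a smaller eta (stronger contraction per step) or a higher threshold shortens the critical length. This is the motivation for breaking a long linear chain into shorter, consolidated segments in a DAG.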