Token Complexity: The minimum number of tokens an LLM requires to successfully solve a specific problem; accuracy drops sharply if the response is shorter than this threshold
CoT: Chain-of-Thought—a prompting technique where models generate intermediate reasoning steps before the final answer
Universal Trade-off Curve: The observation that regardless of the specific prompting strategy (formatting, language, constraints), reasoning accuracy is primarily a function of response length
Rate-Distortion Theory: A concept from information theory describing the minimum amount of data required to represent a signal at a given level of distortion; used here to bound how far reasoning chains can be compressed without losing accuracy
Knapsack Problem: A combinatorial optimization problem of selecting items to maximize total value under a capacity constraint; used here to model choosing a chain length per problem so as to maximize accuracy under a total token budget
GSM8K: A benchmark dataset of grade school math word problems
MATH-500: A 500-problem subset of the MATH dataset containing challenging competition mathematics problems
MMLU-Pro Math: A subset of the MMLU-Pro benchmark focusing on higher-level mathematical reasoning
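The knapsack formulation above can be sketched in code. The following is a minimal illustration, not the paper's implementation: each problem has a few candidate chain lengths, each with a token cost and an (invented) expected accuracy, and a dynamic program picks exactly one option per problem to maximize total expected accuracy under a shared token budget (a multiple-choice knapsack). The function name and all numbers are hypothetical.

```python
def allocate_budget(problems, budget):
    """Multiple-choice knapsack over chain lengths.

    problems: list of option lists; each option is (token_cost, expected_accuracy).
    Exactly one option must be chosen per problem.
    Returns the maximum total expected accuracy achievable within `budget` tokens.
    """
    # best[b] = max total accuracy using at most b tokens over problems seen so far
    best = [0.0] * (budget + 1)
    for options in problems:
        # -inf marks budgets where no valid choice exists for this problem yet
        new_best = [float("-inf")] * (budget + 1)
        for b in range(budget + 1):
            for cost, acc in options:
                if cost <= b and best[b - cost] + acc > new_best[b]:
                    new_best[b] = best[b - cost] + acc
        best = new_best
    return max(best)


# Hypothetical data: a zero-cost option models answering with no chain-of-thought.
problems = [
    [(0, 0.2), (50, 0.6), (200, 0.9)],  # harder problem: accuracy rises with length
    [(0, 0.5), (100, 0.8)],             # easier problem: shorter chain suffices
]
print(allocate_budget(problems, 300))  # best split: 200 + 100 tokens
```

With a budget of 300 tokens the optimum funds the long chain on the hard problem and the medium chain on the easy one; shrinking the budget to 250 forces a shorter chain somewhere, and total expected accuracy drops, mirroring the trade-off curve described above.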