LRM: Large Reasoning Model, an LLM trained specifically for complex reasoning (e.g., DeepSeek-R1, o1)
MI: Mutual Information—a measure of the mutual dependence between two random variables (here, the model's hidden state and the correct answer)
HSIC: Hilbert-Schmidt Independence Criterion, a kernel-based measure of statistical dependence, used here to estimate Mutual Information by embedding the two variables' distributions in a Reproducing Kernel Hilbert Space
Thinking Tokens: Specific tokens (e.g., 'Wait', 'Hmm', 'Therefore') identified by the paper as having high Mutual Information with the correct answer
MI Peaks: Sudden, significant increases in the Mutual Information trajectory during the generation process
Representation Recycling: A proposed method where the informative hidden states at MI peaks are iterated multiple times through the model to refine reasoning
RKHS: Reproducing Kernel Hilbert Space—a space of functions used in kernel methods (like HSIC) to measure dependencies between complex, high-dimensional data
TTTS: Thinking Token based Test-time Scaling—a method that forces the model to generate thinking tokens when extra compute budget is available
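
To make the HSIC entry above concrete, here is a minimal sketch of the standard biased empirical HSIC estimator, trace(K H L H) / (n - 1)^2, with Gaussian (RBF) kernels. The kernel choice, the bandwidths, the function names, and the synthetic data are all assumptions made for illustration; none of them is taken from the paper, whose exact estimator settings are not given in this glossary.

```python
import numpy as np

def rbf_kernel(x, sigma):
    """Gaussian kernel matrix for the rows of x (shape: n x d)."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples x and y (n rows each)."""
    n = x.shape[0]
    K = rbf_kernel(x, sigma_x)
    L = rbf_kernel(y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Synthetic stand-ins (hypothetical): x plays the role of hidden states,
# y_dep depends on x, y_ind is drawn independently of x.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4))
y_dep = x + 0.1 * rng.normal(size=(200, 4))
y_ind = rng.normal(size=(200, 4))

hsic_dep = hsic(x, y_dep, sigma_x=2.0, sigma_y=2.0)
hsic_ind = hsic(x, y_ind, sigma_x=2.0, sigma_y=2.0)
print(f"dependent pair:   {hsic_dep:.5f}")
print(f"independent pair: {hsic_ind:.5f}")
```

The dependent pair should score noticeably higher than the independent one. In the setting this glossary describes, x would be the hidden state at a generation step and y a representation of the correct answer, so that tracking the score across steps gives the trajectory in which MI Peaks appear.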