LCoT: Long Chain-of-Thought—a reasoning strategy where models engage in deliberate, step-by-step thinking before answering
LCoT2Tree: The proposed framework that converts sequential reasoning text into a hierarchical tree structure for analysis
Overthinking: A phenomenon where lengthening a reasoning chain fails to improve, or even degrades, the quality of the final answer
GATv2: Graph Attention Network v2—a GNN architecture used here to process the extracted reasoning trees
Backtracking: A structural pattern where the reasoning process reverts to a previous state to try a different path
Best-of-N: A decoding strategy where N samples are generated, and a selector (in this case, the tree-based classifier) picks the best one
PRM: Process Reward Model—a model trained to score the intermediate steps of reasoning
System 2 thinking: Slow, deliberate, and logical reasoning processes, often emulated by models like DeepSeek-R1
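
The Best-of-N strategy above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `score` stands in for the tree-based selector, and here it is just a stub mapping each candidate to a hypothetical correctness score.

```python
def best_of_n(samples, score):
    """Return the candidate that the selector scores highest."""
    return max(samples, key=score)

# Hypothetical usage: three candidate reasoning chains with stub scores
# (in the framework, these scores would come from the tree-based classifier).
candidates = ["chain A", "chain B", "chain C"]
stub_scores = {"chain A": 0.2, "chain B": 0.9, "chain C": 0.5}
print(best_of_n(candidates, stub_scores.get))  # prints "chain B"
```

The same pattern works with any scoring function, e.g. a PRM that aggregates per-step scores into a single value per sample.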