TIR: Tool-Integrated Reasoning—interleaving natural language thought with executable code blocks (e.g., Python) to solve sub-problems
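The harness side of TIR can be sketched in a few lines: the model emits a Python block, the harness executes it and feeds the captured output back into the context. This is a minimal illustration, not the paper's actual sandbox; the function names and the ```output``` wrapper format are assumptions.

```python
import io
from contextlib import redirect_stdout

def run_tool_block(code: str) -> str:
    """Execute a model-emitted Python block and capture its stdout,
    which is then appended to the reasoning context."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        exec(code, {})  # real harnesses sandbox and time-limit this call
    return buf.getvalue().strip()

# e.g., the model offloads an arithmetic sub-problem to code:
code_block = "print(sum(i * i for i in range(1, 11)))"
tool_output = run_tool_block(code_block)
context_update = f"```output\n{tool_output}\n```"
print(tool_output)  # → 385
```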
CoT: Chain-of-Thought—a prompting method where the model generates intermediate reasoning steps before the final answer
GenSelect: Generative Solution Selection—a method where a model is presented with multiple candidate solution summaries and generates a reasoning trace to select the best one
RoPE: Rotary Positional Embeddings—a technique for encoding position information in Transformers; here, the base frequency is scaled to support longer context windows
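What "scaling the base frequency" does can be seen from the standard RoPE inverse-frequency formula θᵢ = base^(−2i/d): raising the base slows every rotation, so positions that are far apart still map to distinguishable angles. A small sketch; the base values here are illustrative, not the paper's.

```python
import numpy as np

def rope_inv_freq(head_dim: int, base: float = 10_000.0) -> np.ndarray:
    """Per-pair rotation frequencies theta_i = base^(-2i/d) for RoPE."""
    return base ** (-np.arange(0, head_dim, 2) / head_dim)

# A larger base lowers every frequency except the first (theta_0 = 1),
# stretching the positional "wavelengths" to cover a longer context.
short_ctx = rope_inv_freq(128, base=10_000.0)
long_ctx = rope_inv_freq(128, base=500_000.0)
assert np.all(long_ctx[1:] < short_ctx[1:])
```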
Pass@k: The probability that at least one of the k generated solutions is correct
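Pass@k is commonly computed with the unbiased estimator 1 − C(n−c, k)/C(n, k), which averages the pass rate over all size-k subsets of n drawn samples (c of them correct). A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: n samples drawn, c correct, k <= n."""
    if n - c < k:
        return 1.0  # every size-k subset contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 10 samples of which 3 are correct, Pass@1 is simply 3/10.
print(round(pass_at_k(10, 3, 1), 4))  # → 0.3
```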
Maj@k: The accuracy obtained by taking the most frequent answer (majority vote) among k generated solutions
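Per problem, Maj@k scoring reduces to a frequency count over the k sampled answers; Maj@k is then the mean of this score over problems. A sketch (ties are broken by first occurrence here; real evaluations may differ):

```python
from collections import Counter

def maj_at_k(answers, reference) -> int:
    """1 if the most frequent sampled answer equals the reference, else 0."""
    top_answer, _ = Counter(answers).most_common(1)[0]
    return int(top_answer == reference)

print(maj_at_k(["42", "41", "42", "42"], "42"))  # → 1
print(maj_at_k(["7", "8", "8"], "7"))            # → 0
```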
SFT: Supervised Fine-Tuning—training a pre-trained model on a labeled dataset of input-output demonstrations
Speculative Decoding: An inference acceleration technique where a small 'drafter' model proposes several tokens at a time, which the larger target model then verifies in parallel, keeping only the accepted prefix
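The draft-then-verify loop can be sketched for greedy decoding with deterministic stand-in "models" (real systems verify the whole draft in one batched target forward pass and use rejection sampling for non-greedy decoding; everything below is a toy assumption):

```python
def speculative_decode(target, drafter, prompt, n_draft=4, max_len=24):
    """Greedy speculative decoding: the cheap drafter proposes n_draft
    tokens; the target keeps the longest agreeing prefix and emits one
    corrected token of its own. The output matches plain greedy decoding
    with `target`, just with fewer target calls when the drafter agrees."""
    seq = list(prompt)
    while len(seq) < max_len:
        draft = []
        for _ in range(n_draft):              # drafter runs autoregressively
            draft.append(drafter(seq + draft))
        accepted = 0
        for i in range(n_draft):              # target verifies the draft
            if target(seq + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        seq += draft[:accepted]
        seq.append(target(seq))               # correction / bonus token
    return seq

# Stand-in "models": the next token is a function of the sequence so far.
target = lambda s: sum(s) % 7
drafter = lambda s: (sum(s) + (len(s) % 5 == 0)) % 7   # sometimes disagrees

out = speculative_decode(target, drafter, [1, 2], n_draft=4, max_len=24)
plain = [1, 2]
while len(plain) < 24:
    plain.append(target(plain))
assert out[: len(plain)] == plain  # identical to plain greedy decoding
```

The verification step is what preserves correctness: every accepted token is exactly what the target would have produced greedily, so the drafter only affects speed, never the output.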
Model Merging: Combining the weights of two fine-tuned variants of the same base model (e.g., via linear interpolation) to blend their capabilities
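Linear interpolation amounts to one weighted average per parameter. A dependency-free sketch using plain floats in place of tensors (real merges operate on framework state dicts with identical keys and shapes; the key names are illustrative):

```python
def merge_linear(state_a, state_b, alpha=0.5):
    """Per-parameter interpolation: merged = alpha * A + (1 - alpha) * B."""
    assert state_a.keys() == state_b.keys(), "merging needs identical architectures"
    return {k: alpha * state_a[k] + (1 - alpha) * state_b[k] for k in state_a}

ckpt_a = {"layer.weight": 1.0, "layer.bias": 0.0}
ckpt_b = {"layer.weight": 3.0, "layer.bias": 2.0}
print(merge_linear(ckpt_a, ckpt_b, alpha=0.5))  # → {'layer.weight': 2.0, 'layer.bias': 1.0}
```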