Chain-of-Thought (CoT): A prompting technique where the model is encouraged to generate intermediate reasoning steps (rationales) before the final answer
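A minimal sketch of what a CoT prompt looks like in practice. The exemplar below is the well-known tennis-balls example from the original CoT prompting work; the helper name `build_cot_prompt` is illustrative, not from the source:

```python
# Hypothetical few-shot CoT exemplar: the rationale (intermediate steps)
# comes before the final answer, and the model is expected to imitate it.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question, exemplars=(COT_EXEMPLAR,)):
    # Prepend the exemplars, then leave "A:" open for the model to complete
    # with its own rationale followed by a final answer.
    return "".join(exemplars) + f"Q: {question}\nA:"

print(build_cot_prompt("If I have 3 apples and eat 1, how many remain?"))
```

Without the rationale in the exemplar (i.e., "A: The answer is 11." alone), the same scaffold would be a standard few-shot ICL prompt rather than a CoT prompt.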
In-Context Learning (ICL): The ability of a model to learn a task from a few examples provided in the prompt at inference time, without weight updates
Pretrained Priors: Knowledge and patterns encoded in the model's weights during pre-training, often reflecting common sense or factual world knowledge
Slow Thinking: Also known as test-time scaling; a mode in which the model spends more inference-time compute to generate longer, more detailed reasoning chains, improving accuracy on complex tasks

False-Answer CoT: A robustness test where the prompt exemplars keep their correct reasoning steps but have their final answers deliberately flipped to incorrect values
False-Rationale CoT: A robustness test where prompt exemplars have correct final answers but logically flawed or noisy reasoning steps
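The two perturbations above are mirror images of each other: each corrupts exactly one component of a gold exemplar. A sketch, assuming an exemplar is a (question, rationale, answer) triple; the function names and the noise text are illustrative, not from the source:

```python
# Sketch of the two robustness perturbations on a gold CoT exemplar.
# An exemplar is modeled as a (question, rationale, answer) triple.

def false_answer(exemplar):
    """Keep the correct rationale, flip the final answer to a wrong one."""
    q, rationale, answer = exemplar
    wrong = "12" if answer != "12" else "13"  # any value != the true answer
    return (q, rationale, wrong)

def false_rationale(exemplar, noise="Bananas are yellow fruits."):
    """Keep the correct final answer, replace the rationale with noise."""
    q, rationale, answer = exemplar
    return (q, noise, answer)

gold = ("What is 5 + 6?", "5 plus 6 equals 11.", "11")
```

If a model's accuracy survives `false_rationale` exemplars but collapses under `false_answer` ones (or vice versa), that tells you which component of the demonstration the model is actually relying on.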
Greedy Decoding: A decoding strategy where the model selects the highest-probability token at each step, making generation deterministic for a fixed prompt
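Greedy decoding can be sketched in a few lines. The `next_token_logits` function below is a toy stand-in for a real language model's output head (its vocabulary and scores are invented for illustration); the decoding loop itself is the actual greedy rule: take the argmax at every step until an end-of-sequence token appears.

```python
# Greedy decoding sketch: argmax over the logits at every step.

def next_token_logits(prefix):
    """Toy deterministic 'model': scores depend only on the last token."""
    vocab = ["<eos>", "the", "cat", "sat"]
    last = prefix[-1] if prefix else "<bos>"
    table = {
        "<bos>": [0.0, 3.0, 1.0, 0.5],  # start -> prefers "the"
        "the":   [0.1, 0.2, 2.5, 1.0],  # "the" -> prefers "cat"
        "cat":   [0.3, 0.1, 0.2, 3.0],  # "cat" -> prefers "sat"
        "sat":   [4.0, 0.5, 0.1, 0.2],  # "sat" -> prefers "<eos>"
    }
    return vocab, table[last]

def greedy_decode(max_steps=10):
    tokens = []
    for _ in range(max_steps):
        vocab, logits = next_token_logits(tokens)
        # The greedy rule: pick the single highest-scoring token.
        best = vocab[max(range(len(logits)), key=logits.__getitem__)]
        if best == "<eos>":
            break
        tokens.append(best)
    return tokens

print(greedy_decode())  # → ['the', 'cat', 'sat']
```

Because ties aside there is no randomness, greedy decoding always yields the same output for the same prompt; sampling-based strategies (temperature, nucleus sampling) trade this determinism for diversity.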
Task-agnostic CoT: Using reasoning exemplars from a completely different domain (e.g., using Sports examples to prompt for Math) to test if the model just copies format