Sys2-FT: System-2 Fine-tuning—a method where the model generates its own training data (QA pairs, paraphrases) based on new information before fine-tuning on that data
New News: A dataset of 75 hypothetical but plausible news items across domains (math, coding, events) with downstream questions requiring reasoning
Contextual Shadowing Effect: A phenomenon where placing the fact to be learned in the context during fine-tuning prevents the model from encoding it into its weights, because the model relies on the context instead
Curse of Overexposure: A phenomenon where fine-tuning on a specific fact degrades the model's ability to perform in-context learning on that same fact
Self-QA: A specific Sys2-FT protocol where the model generates question-answer pairs about the new information to use as fine-tuning data
Replay elements: Self-generated artifacts (paraphrases, implications, QAs) produced by the model when processing new news, used as training data
ICL: In-Context Learning—the ability of a model to perform tasks based on instructions or examples provided in the prompt without updating weights
LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique that trains small low-rank update matrices while keeping the pretrained weights frozen
FT-ICL gap: The performance difference between a model fine-tuned on a piece of information and a model given that information in its context window
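
To make a few of these terms concrete, here is a minimal Python sketch of how Self-QA replay elements might be packed into fine-tuning records, and how an FT-ICL gap might be computed. It is an illustration under stated assumptions, not the authors' pipeline: the names `QAPair`, `build_self_qa_examples`, and `ft_icl_gap` are hypothetical, the prompt/completion record format is assumed, and the sign convention for the gap (ICL accuracy minus fine-tuned accuracy) is a choice made here for illustration. The `news_in_context` flag shows the setup that the Contextual Shadowing Effect entry warns about, where the news sits in the prompt during fine-tuning.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QAPair:
    """One self-generated question-answer pair about a news item (hypothetical type)."""
    question: str
    answer: str


def build_self_qa_examples(
    news: str, qa_pairs: List[QAPair], news_in_context: bool = False
) -> List[Dict[str, str]]:
    """Turn self-generated QA pairs into prompt/completion fine-tuning records.

    With news_in_context=True the news is prepended to every prompt. Per the
    glossary, this is the configuration associated with contextual shadowing,
    so the default leaves the news out of the prompt.
    """
    examples = []
    for qa in qa_pairs:
        prompt = f"Question: {qa.question}\nAnswer:"
        if news_in_context:
            prompt = f"{news}\n\n{prompt}"
        examples.append({"prompt": prompt, "completion": " " + qa.answer})
    return examples


def ft_icl_gap(icl_accuracy: float, ft_accuracy: float) -> float:
    """FT-ICL gap under an assumed sign convention: positive means ICL is ahead."""
    return icl_accuracy - ft_accuracy
```

For example, calling `build_self_qa_examples(news, pairs)` yields one record per QA pair with the news absent from the prompt, so anything the model learns about the news has to come from the QA content itself. Passing `news_in_context=True` produces the same records with the news prepended, which is useful mainly as a comparison condition.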