RAG: Retrieval-Augmented Generation—AI systems that answer questions by first searching for relevant documents
RAFT: Retrieval Augmented Fine-Tuning—a method that fine-tunes LLMs to ignore distractor documents
conditional memorization bias: A failure mode where an LLM learns to rely on context or memory based on static training data assignments rather than the actual relevance of the text
canonical answer overfitting: When an LLM memorizes the specific phrasing of a single ground-truth answer instead of the underlying semantic knowledge
replay buffer: A collection of examples from previous tasks used during training to prevent the model from forgetting earlier skills
LoRA: Low-Rank Adaptation—a parameter-efficient fine-tuning technique
nucleus sampling: A decoding strategy (also called top-p sampling) that samples from the smallest set of most-probable tokens whose cumulative probability reaches or exceeds a threshold p
catastrophic forgetting: The tendency of neural networks to abruptly forget previously learned information upon learning new information
domain identifier: A specific token or phrase prepended to inputs to signal the model to switch to a specific domain context
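
Example (nucleus sampling): the definition above can be illustrated with a minimal sketch in plain Python. It assumes the model's next-token distribution is already available as a list of probabilities indexed by token id. The function names `nucleus_filter` and `nucleus_sample` are hypothetical, chosen for illustration, and do not come from any particular library.

```python
import random


def nucleus_filter(probs, p):
    """Return {token_id: renormalized_prob} for the smallest top-ranked set
    whose cumulative probability reaches or exceeds p."""
    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:  # assumption: the threshold is inclusive
            break
    # Renormalize so the kept probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def nucleus_sample(probs, p, rng=random):
    """Draw one token id from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, p)
    token_ids = list(nucleus)
    weights = [nucleus[i] for i in token_ids]
    return rng.choices(token_ids, weights=weights, k=1)[0]
```

With probs = [0.5, 0.3, 0.15, 0.05] and p = 0.75, only the two most probable tokens remain in the nucleus, so the low-probability tail is never sampled.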