CMT: Context Utilisation Manipulation Technique—any method (prompting, fine-tuning, decoding, etc.) designed to control how an LM uses provided context
Context Utilisation: The extent to which an LM relies on external context versus its internal parametric memory to generate an answer
Gold context: Retrieved information that is relevant and factually correct
Conflicting context: Retrieved information that is relevant but contradicts the LM's pre-existing internal memory (e.g., updated facts)
Irrelevant context: Retrieved information that provides no help in answering the query and acts as a distractor
BCU: Binary Context Utilisation, a metric that scores 1 if the model outputs the context-supported answer (for gold or conflicting context) or the memory-supported answer (for irrelevant context), and 0 otherwise
Parametric memory: Knowledge stored within the model's pre-trained weights
CounterFact: A diagnostic dataset containing synthesized facts designed to conflict with an LM's internal knowledge
LAMA: Language Model Analysis—a dataset of facts used to probe what LMs know
Mechanistic intervention: Modifying internal model components (like attention heads) during inference to steer behavior
Contrastive decoding: Adjusting the next-token distribution by contrasting the model's logits computed with the context against its logits computed without the context
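
Two of the entries above, BCU and contrastive decoding, may be easier to follow as short computations. The Python sketch below is illustrative only and is not taken from the source. The function names (`bcu`, `contrastive_logits`, `softmax`) are hypothetical, the case-insensitive exact-match comparison in `bcu` is an assumption about how answers are matched, and the `(1 + alpha) * with - alpha * without` rule with its `alpha` weight is one common way to contrast the two sets of logits, assumed here because the definition above does not fix a formula.

```python
import math


def bcu(prediction, context_type, context_answer, memory_answer):
    """Binary Context Utilisation for a single example (hypothetical helper).

    Returns 1 if the prediction matches the answer the metric expects
    for this context type, and 0 otherwise.
    """
    if context_type not in {"gold", "conflicting", "irrelevant"}:
        raise ValueError(f"unknown context type: {context_type!r}")

    # Assumption: answers are compared by case-insensitive exact match.
    def normalise(text):
        return text.strip().lower()

    # Irrelevant context: the model should fall back on its memory answer.
    # Gold or conflicting context: the model should follow the context.
    target = memory_answer if context_type == "irrelevant" else context_answer
    return int(normalise(prediction) == normalise(target))


def contrastive_logits(with_context, without_context, alpha=1.0):
    """Contrast per-token logits computed with and without the context.

    Assumption: adjusted = (1 + alpha) * with_context - alpha * without_context,
    so tokens that the context makes more likely are boosted.
    """
    if len(with_context) != len(without_context):
        raise ValueError("logit vectors must have the same length")
    return [
        (1 + alpha) * w - alpha * wo
        for w, wo in zip(with_context, without_context)
    ]


def softmax(logits):
    """Turn logits into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy three-token vocabulary: the context favours token 0, memory favours token 1.
with_ctx = [2.0, 1.0, 0.0]
without_ctx = [1.0, 2.0, 0.0]
adjusted = contrastive_logits(with_ctx, without_ctx, alpha=1.0)
probs = softmax(adjusted)
```

With these toy values, `adjusted` is `[3.0, 0.0, 0.0]`, so the token supported by the context ends up with the highest probability. Setting `alpha=0.0` returns the with-context logits unchanged, which makes the contrast strength easy to vary in experiments.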