Knowledge Editing (KE): Techniques to update specific facts in an LLM (e.g., 'The president is X' -> 'The president is Y') without retraining the entire model
Locate-then-edit: A KE paradigm that identifies specific model weights responsible for a fact and directly modifies them
MEMIT: Mass-Editing Memory in a Transformer, a locate-then-edit method that updates many facts simultaneously by modifying the MLP weights of several layers
Hop Words: Entities in a knowledge graph (like Wikidata) that are directly connected (one hop away) to the subject or object of a fact; used here as potent distractors
Prefix Context: Text appearing before the actual query or prompt (e.g., conversation history), which can influence the model's generation
Hidden State Variance: The degree to which the model's internal representation changes when the input context changes; CoRE aims to minimize this for the edited fact
Key-Value Memory: Interpretation of Transformer MLP layers where the input acts as a 'key' to retrieve a 'value' (fact) stored in the weights
KL Divergence: An asymmetric measure of how one probability distribution differs from a reference distribution; used here to keep the edited model from changing unrelated knowledge
G-Eval: A framework using LLMs (like GPT-4) to evaluate the quality or coherence of text
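
To make three of the entries above concrete (Key-Value Memory, KL Divergence, and Hidden State Variance), here is a minimal pure-Python sketch on toy vectors. It is an illustration under stated assumptions, not the implementation of MEMIT or CoRE. `rank_one_edit` is a hypothetical helper that applies the simplest possible rank-one update to a toy weight matrix; published locate-then-edit methods additionally weight the update using statistics of previously stored keys. `hidden_state_variance` assumes the variance is taken per dimension across prefix contexts and then averaged, which may differ from the exact definition CoRE uses.

```python
import math


def matvec(W, k):
    """Read a 'value' out of the toy memory: v = W k."""
    return [sum(w * x for w, x in zip(row, k)) for row in W]


def rank_one_edit(W, k_star, v_star):
    """Return W' = W + (v* - W k*) k*^T / (k*^T k*), so that W' k* = v*.

    Hypothetical, simplified update: keys orthogonal to k* are unaffected.
    """
    residual = [v - o for v, o in zip(v_star, matvec(W, k_star))]
    norm_sq = sum(x * x for x in k_star)
    return [
        [w + r * x / norm_sq for w, x in zip(row, k_star)]
        for row, r in zip(W, residual)
    ]


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hidden_state_variance(states):
    """Mean per-dimension (population) variance of hidden vectors.

    Each vector in `states` is assumed to be the hidden state for the same
    query under a different prefix context.
    """
    n, dim = len(states), len(states[0])
    means = [sum(s[d] for s in states) / n for d in range(dim)]
    per_dim = [sum((s[d] - means[d]) ** 2 for s in states) / n for d in range(dim)]
    return sum(per_dim) / dim


# Toy 2x2 memory, the key for the fact being edited, and its new value.
W = [[1.0, 0.0], [0.0, 1.0]]
k_star, v_star = [1.0, 2.0], [3.0, -1.0]
W_edited = rank_one_edit(W, k_star, v_star)

# Output distributions on an unrelated prompt, before and after an edit
# (made-up numbers, for illustration only).
p_before, p_after = [0.5, 0.5], [0.9, 0.1]
drift = kl_divergence(p_before, p_after)

# Hidden states for one query under two different prefixes (made-up numbers).
spread = hidden_state_variance([[0.0, 0.0], [2.0, 2.0]])
```

In this sketch, a lower `drift` means the edit disturbed the unrelated prediction less, and a lower `spread` means the representation of the query is more stable across prefix contexts, which is the property the Hidden State Variance entry describes.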