Model Editing: Methods that directly update specific parameters inside a model to correct individual facts (e.g., changing 'The PM of the UK is X' to 'The PM of the UK is Y')
Drawdown: A metric measuring the unintended negative impact of model edits on other, unrelated knowledge or capabilities (similar to catastrophic forgetting)
Ripple Effect: The phenomenon where changing one fact (A) also requires updating every fact that logically follows from it (B, where A implies B), which current editing methods struggle to do
LAMA: LAnguage Model Analysis, a benchmark that probes factual knowledge in LLMs using fill-in-the-blank (cloze) queries
Concept Erasure: Methods that remove specific concepts (such as gender or race) from a model's internal representations so that they cannot influence its outputs
RAG: Retrieval-Augmented Generation—systems that fetch relevant text from external databases to answer queries, separating knowledge storage from reasoning
AIS: Attributable to Identified Sources, a framework for evaluating whether generated text is supported by identifiable sources
Ontology Subsumption: Inference tasks involving hierarchical relationships between concepts (e.g., if A is a Dog, A is also an Animal), used to test logical consistency
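
As an illustration of the Ontology Subsumption entry, the following minimal Python sketch shows how a subsumption check can be used to flag logically inconsistent answers. The toy hierarchy, the function names (is_subsumed_by, consistency_violations), and the answer format are all hypothetical and chosen for illustration only; they are not taken from any particular benchmark.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: concept -> set of direct parent concepts.
PARENTS: Dict[str, Set[str]] = {
    "Poodle": {"Dog"},
    "Dog": {"Mammal"},
    "Cat": {"Mammal"},
    "Mammal": {"Animal"},
}


def is_subsumed_by(child: str, ancestor: str,
                   parents: Dict[str, Set[str]] = PARENTS) -> bool:
    """Return True if `ancestor` subsumes `child` (reflexive and transitive)."""
    stack = [child]
    seen: Set[str] = set()
    while stack:
        node = stack.pop()
        if node == ancestor:
            return True
        if node in seen:
            continue
        seen.add(node)
        # Walk upward through the direct parents of the current concept.
        stack.extend(parents.get(node, ()))
    return False


def consistency_violations(
    answers: Dict[Tuple[str, str], bool],
    parents: Dict[str, Set[str]] = PARENTS,
) -> List[Tuple[str, str, str]]:
    """Find (entity, a, b) where the model affirmed 'entity is a',
    a is subsumed by b, yet the model denied 'entity is b'."""
    violations = []
    for (entity, a), affirmed in answers.items():
        if not affirmed:
            continue
        for (other, b), verdict in answers.items():
            if other == entity and not verdict and is_subsumed_by(a, b, parents):
                violations.append((entity, a, b))
    return sorted(violations)


if __name__ == "__main__":
    # Assumed format: (entity, concept) -> the model's yes/no answer.
    model_answers = {
        ("Rex", "Dog"): True,
        ("Rex", "Animal"): False,  # contradicts Dog being an Animal
        ("Rex", "Cat"): False,     # no contradiction: Dog is not a Cat
    }
    print(consistency_violations(model_answers))
```

In this sketch a violation is reported only when an affirmed concept is subsumed by a denied one, which is the kind of logical inconsistency the entry describes.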