DyKnow: A dynamic benchmarking framework that refreshes its data points with real-time information from Wikidata to evaluate how current an LLM's factual knowledge is.
ENAF: Entity-Aware Fine-tuning, a method that introduces structured entity representations (such as unique IDs) during fine-tuning to unify fragmented knowledge about an entity.
ROME: Rank-One Model Editing, a method that edits a specific fact in an LLM by applying a rank-one update to MLP (feed-forward) weights.
MEMIT: Mass-Editing Memory in a Transformer, a method that updates thousands of factual associations in an LLM simultaneously.
Subject Perturbation: Evaluating model consistency by changing the name of the subject entity (e.g., using 'CR7' instead of 'Cristiano Ronaldo').
Property Perturbation: Evaluating model consistency by rephrasing the relationship query (e.g., 'head of state' vs 'president').
Prompt Agreement: A metric measuring the consistency of model outputs across different variations (perturbations) of the same question.
Soft Neurosymbolic: Combining neural network learning with symbolic representations (such as entity tags) without imposing rigid symbolic-logic constraints.
Wikidata: A collaboratively edited, structured knowledge base where facts are stored as triples with qualifiers for temporal validity.
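
As an illustration of the Prompt Agreement entry above, the following is a minimal sketch under one assumed reading of the metric: the share of answers, across all perturbed variants of a question, that match the most frequent answer. The function names `normalize` and `prompt_agreement` and the sample answers are hypothetical, and the exact formula used in the source work may differ.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as disagreement."""
    return " ".join(answer.lower().split())


def prompt_agreement(answers: list[str]) -> float:
    """Assumed definition: fraction of answers equal to the most frequent normalized answer."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    # most_common(1) returns [(answer, count)] for the majority answer
    majority_count = counts.most_common(1)[0][1]
    return majority_count / len(answers)


# Hypothetical model outputs for one question asked under subject and
# property perturbations (for example 'CR7' vs 'Cristiano Ronaldo').
answers = ["Portugal", "portugal", "Portugal ", "Spain"]
score = prompt_agreement(answers)
```

Under this reading, a score of 1.0 means the model gave the same answer to every variant, and lower scores indicate sensitivity to rephrasing.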