Base Model: Evaluation covers 24 LLMs including GPT-4, Llama-3, Mistral, OLMo, and OpenELM
Training Method: Knowledge editing (ROME, MEMIT, SERAC, IKE), applied to a subset of models (GPT-2, GPT-J, Llama-2-Chat, Mistral-Instruct)
Adaptation: Model editing either modifies model weights directly (ROME, MEMIT) or leaves them unchanged and relies on external memory (SERAC) or in-context information (IKE)
Trainable Parameters: Varies by editing method (feed-forward layers for ROME/MEMIT, external classifier for SERAC)
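The split between weight-modifying and non-weight-modifying editors described above can be written down as a small lookup table. This is only an illustrative sketch: the `EditingMethod` class, its field names, and the `weight_editing_methods` helper are hypothetical and not taken from the paper or any editing library; the mechanism strings paraphrase the fields above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingMethod:
    """Hypothetical record describing one knowledge-editing method."""
    name: str
    modifies_weights: bool  # True if the base model's weights are changed
    mechanism: str          # short paraphrase of how the edit is applied


# Entries follow the Adaptation / Trainable Parameters fields above.
METHODS = [
    EditingMethod("ROME", True, "updates feed-forward layer weights"),
    EditingMethod("MEMIT", True, "updates feed-forward layer weights"),
    EditingMethod("SERAC", False, "external memory with a trained classifier"),
    EditingMethod("IKE", False, "edit supplied through the prompt context"),
]


def weight_editing_methods(methods):
    """Return the names of methods that change the base model's weights."""
    return [m.name for m in methods if m.modifies_weights]


if __name__ == "__main__":
    print(weight_editing_methods(METHODS))
```

Keeping the mechanism as a field makes it easy to group results by editing family when comparing methods across models.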
Training Data:
- 130 time-sensitive facts
- Subjects: Top 50 countries by GDP, top 30 athletes, top 25 organizations
Compute: SERAC training required 1x NVIDIA A100 (80GB). Inference used 2x NVIDIA RTX 3090 (24.5GB), except for Mixtral, which used 3x A100.