
Quantifying reliance on external information over parametric knowledge during Retrieval Augmented Generation (RAG) using mechanistic analysis

R Ghosh, R Seetharaman, H Wadhwa…
Microsoft, University of Massachusetts, Amherst, University of Maryland, College Park
arXiv, 10/2024
RAG Factuality

📝 Paper Summary

Mechanistic Interpretability RAG Behavior Analysis
Language models use a "shortcut" mechanism in RAG settings, heavily biasing attention toward retrieved context tokens while suppressing reliance on internal parametric memory for factual predictions.
Core Problem
While RAG is widely used to improve factual accuracy, the internal mechanism by which LLMs prioritize retrieved context over their pre-trained (parametric) knowledge is not well understood.
Why it matters:
  • Understanding how models balance internal vs. external knowledge is crucial for diagnosing hallucinations and inconsistencies in RAG systems
  • Previous work focused on editing knowledge (ROME, MEMIT) or system-level RAG performance, leaving a gap in mechanistic understanding of the inference process itself
Concrete Example: When a model completes 'The Eiffel Tower is located in...' with 'Paris', it typically relies on knowledge stored in its weights. When it is also given a document saying 'The Eiffel Tower is in Las Vegas', it is not known mechanistically whether the model suppresses its internal 'Paris' association or simply overwrites the output at the last layer.
Key Novelty
Mechanistic "Shortcut" Analysis of RAG
  • Applies Causal Mediation Analysis to compare internal activation patterns between standard generation and RAG-based generation
  • Demonstrates that the presence of context causes a 'shortcut' effect: the model effectively bypasses the usual internal computation path rooted in the subject token (e.g., 'Eiffel Tower') and instead attends directly to the answer token provided in the context
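The causal-tracing bookkeeping behind these comparisons can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the model is a hypothetical toy (per-token hidden states, mean pooling, no attention), and the names `forward`, `indirect_effects`, `average_indirect_effect`, and the `noise` scale are invented for the example. The indirect effect of a position is taken as the answer probability after restoring that position's clean hidden state in a corrupted run, minus the answer probability of the corrupted run; averaging over prompts gives an AIE.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def forward(x, W1, W_out, patch=None):
    """Hypothetical toy model: per-token hidden states, mean-pooled, projected to vocab.

    patch=(position, vector) overwrites one hidden state before pooling.
    """
    h = np.tanh(x @ W1)                      # (T, d_hidden)
    if patch is not None:
        pos, vec = patch
        h = h.copy()
        h[pos] = vec                         # restore a clean hidden state
    probs = softmax(h.mean(axis=0) @ W_out)  # distribution over a toy vocab
    return h, probs

def indirect_effects(x_clean, subject_pos, answer_id, W1, W_out, rng, noise=3.0):
    """Per-position indirect effect: P(answer | restored) - P(answer | corrupted)."""
    h_clean, _ = forward(x_clean, W1, W_out)
    # Corrupted run: add Gaussian noise to the subject-token embeddings only.
    x_corr = x_clean.copy()
    x_corr[subject_pos] += noise * rng.standard_normal(
        (len(subject_pos), x_clean.shape[1])
    )
    _, p_corr = forward(x_corr, W1, W_out)
    effects = np.empty(len(x_clean))
    for pos in range(len(x_clean)):
        _, p_rest = forward(x_corr, W1, W_out, patch=(pos, h_clean[pos]))
        effects[pos] = p_rest[answer_id] - p_corr[answer_id]
    return effects

def average_indirect_effect(prompts, subject_pos, answer_id, W1, W_out, rng):
    """AIE: mean of the per-position indirect effects over a set of prompts."""
    return np.mean(
        [indirect_effects(x, subject_pos, answer_id, W1, W_out, rng) for x in prompts],
        axis=0,
    )
```

Because this toy has no cross-token mixing, only the corrupted subject positions can show a nonzero effect. In a real transformer the corruption propagates to later positions through attention, which is what makes the per-position and per-layer trace informative.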
Evaluation Highlights
  • ~10x decrease in Average Indirect Effect (AIE) on the Last Subject Token for Llama-2-7B when RAG context is added, indicating reduced reliance on internal memory
  • ~35x decrease in AIE on the Last Subject Token for Phi-2 (2.7B) in RAG settings compared to vanilla generation
  • Knocking out attention from the subject token reduces answer probability by ~20-25% in vanilla models, but less than 5% in RAG settings, confirming the shift in reliance
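The attention-knockout measurement can be illustrated with a single toy attention read at the last position. This is a sketch under assumptions rather than the paper's setup: `answer_prob` and `knockout_drop` are hypothetical helpers, and the scores and value vectors are hand-picked to show the qualitative pattern, not taken from the paper. Knocking out an edge means setting its pre-softmax score to negative infinity, so the last token receives zero attention weight from that source position.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def answer_prob(scores, values, answer_id, blocked=()):
    """Answer probability read from the last position's attention output.

    scores:  pre-softmax attention scores from the last token to each source position.
    values:  (T, vocab) logit contribution carried by each source position.
    blocked: source positions whose attention edge is knocked out.
    """
    s = np.array(scores, dtype=float)
    for pos in blocked:
        s[pos] = -np.inf                     # zero attention weight after softmax
    weights = softmax(s)
    logits = weights @ np.asarray(values, dtype=float)
    return softmax(logits)[answer_id]

def knockout_drop(scores, values, subject_pos, answer_id=0):
    """Absolute drop in answer probability when subject attention is knocked out."""
    base = answer_prob(scores, values, answer_id)
    knocked = answer_prob(scores, values, answer_id, blocked=subject_pos)
    return base - knocked

# Hand-picked illustrative numbers, NOT data from the paper.
# Vanilla: positions are [BOS, subject, last]; only the subject carries answer evidence.
vanilla_drop = knockout_drop([0.0, 2.0, 0.0], [[0, 0], [3, 0], [0, 0]], subject_pos=[1])
# RAG: positions are [context answer token, subject, last]; the context token dominates.
rag_drop = knockout_drop([4.0, 2.0, 0.0], [[3, 0], [3, 0], [0, 0]], subject_pos=[1])
print(f"vanilla drop: {vanilla_drop:.3f}, RAG drop: {rag_drop:.3f}")
```

In this toy configuration the drop is large when the subject is the only source of answer evidence and small when a context token supplies the same evidence with a higher attention score, which mirrors the direction of the reported result.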
Breakthrough Assessment
7/10
Provides valuable mechanistic evidence confirming intuitions about RAG behavior (context bias). While the findings are expected, quantifying them via causal tracing and attention knockouts puts them on a more rigorous footing.