
The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination

Yuji Zhang, Sha Li, Cheng Qian, Jiateng Liu, Pengfei Yu, Chi Han, Y. Fung, Kathleen McKeown, ChengXiang Zhai, Manling Li, Heng Ji
University of Illinois Urbana-Champaign, Columbia University, Northwestern University, Stanford University
Annual Meeting of the Association for Computational Linguistics (2025)
Factuality | Pretraining | Benchmark

📝 Paper Summary

Factuality and Hallucination | Mechanism Interpretability | Decoding Strategies
The paper identifies 'knowledge overshadowing', in which dominant knowledge suppresses less frequent facts, as a primary cause of hallucinations. It proposes a log-linear law to predict the resulting hallucination rate and a decoding strategy to mitigate it.
Core Problem
LLMs hallucinate even when trained on strictly factual data because dominant (popular) knowledge patterns suppress less prominent correct information during generation.
Why it matters:
  • Hallucinations are often attributed solely to low-quality or incorrect training data, but this paper shows that errors persist even when the training corpus is 100% factual
  • Existing methods detect hallucinations only after generation; there is no principled way to predict hallucination rates based on training data characteristics before training
  • Reliability in high-stakes domains requires understanding why models prioritize wrong associations (e.g., returning a country's leader when the query asks for a singer from that country)
Concrete Example: When queried for 'famous singer in North Korea', the model incorrectly generates 'Kim Jong Un'. The strong association between 'North Korea' and 'Kim Jong Un' overshadows the specific constraint 'singer', causing the model to misassemble facts.
Key Novelty
The Law of Knowledge Overshadowing and CoDa Decoding
  • Introduces a log-linear law stating that the hallucination rate scales linearly with the logarithm of each of three factors: knowledge popularity, knowledge length, and model size
  • Proposes CoDa (Contrastive Decoding to Amplify Overshadowed Knowledge), which detects overshadowed concepts by masking dominant tokens and contrasting distributions to amplify the suppressed correct information
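The log-linear relation in the first bullet can be written in a generic form. This is only a sketch of what "linear in the logarithm" means: the symbols R (hallucination rate), x (any one of the three factors, with the other two held fixed), and the coefficients a and b are notation assumed here for illustration, not the paper's fitted values or exact parameterization.

```latex
% Assumed generic form: x stands for one of knowledge popularity,
% knowledge length, or model size; a and b are placeholder coefficients.
R(x) \approx a \,\log x + b

% Consequence of this form: doubling x shifts the rate by a constant amount,
% independent of the starting value of x.
R(2x) - R(x) \approx a \,\log 2
```

Under this form, each doubling of a factor adds the same fixed increment to the predicted rate, which is what makes the rate plottable as a straight line against a logarithmic axis.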
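The contrast step described in the CoDa bullet can be illustrated on toy next-token distributions. The sketch below is not the authors' implementation: the probabilities are invented for illustration, the function names and the `alpha` weight are hypothetical, and the scoring rule (log-probability under the full prompt minus log-probability under the masked prompt) is a common contrastive-decoding form assumed here. The masked distribution stands in for what the model predicts when it sees only the dominant context.

```python
import math

def contrastive_scores(p_full, p_masked, alpha=1.0, eps=1e-12):
    """Score each candidate by log p_full - alpha * log p_masked.

    Candidates that are likely only because of the dominant context score low;
    candidates that gain probability from the full prompt score high.
    `alpha` and `eps` are illustrative parameters, not values from the paper.
    """
    return {
        tok: math.log(p_full[tok] + eps) - alpha * math.log(p_masked[tok] + eps)
        for tok in p_full
    }

def pick(scores):
    """Return the candidate with the highest score."""
    return max(scores, key=scores.get)

# Toy distributions (made-up numbers) over three candidate answers.
# p_full: prompt "famous singer in North Korea".
p_full = {"Kim Jong Un": 0.60, "Ri Sol-ju": 0.35, "Pyongyang": 0.05}
# p_masked: same prompt with the constraint "singer" masked out.
p_masked = {"Kim Jong Un": 0.85, "Ri Sol-ju": 0.05, "Pyongyang": 0.10}

greedy_choice = pick(p_full)                             # standard decoding
coda_choice = pick(contrastive_scores(p_full, p_masked)) # contrastive decoding

print("greedy:", greedy_choice)
print("contrastive:", coda_choice)
```

In this toy setup, greedy decoding follows the dominant association, while the contrastive score favors the candidate whose probability rises most when the constraint is present.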
Architecture
Figure 1
Conceptual illustration of knowledge overshadowing. It shows two knowledge pieces: 'Kim Jong Un is a politician in North Korea' (Dominant) and 'Ri Sol-ju is a singer in North Korea' (Overshadowed).
Evaluation Highlights
  • +27.9% improvement in factuality on the custom Overshadowing dataset using the proposed CoDa decoding strategy
  • +13.1% improvement on the MemoTrap benchmark compared to standard decoding
  • +18.3% improvement on the NQ-Swap benchmark, demonstrating generalization to diverse factual tasks
Breakthrough Assessment
8/10
Significant theoretical contribution by formulating a scaling law for hallucinations, coupled with a practical, training-free decoding solution that yields substantial improvements.