โ† Back to Paper List

DoLa: Decoding by contrasting layers improves factuality in LLMs

(MIT/MS) Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James Glass, Pengcheng He
Massachusetts Institute of Technology, Microsoft
ICLR (2024)
Factuality Reasoning Benchmark

๐Ÿ“ Paper Summary

Hallucination suppression Decoding strategies
DoLa improves LLM factuality by contrasting output logits from mature final layers against premature lower layers to amplify factual knowledge, without retrieval or fine-tuning.
Core Problem
Large language models frequently hallucinate, generating content that deviates from real-world facts. They often favor common linguistic patterns over factual accuracy, a tendency the summary describes as mass-seeking behavior.
Why it matters:
  • Hallucinations prevent safe deployment in high-stakes applications like clinical or legal settings where trustworthiness is crucial
  • Existing solutions often require expensive external retrieval, additional fine-tuning, or human labels, which may not always be feasible
  • Language models tend to learn 'lower-level' linguistic information in early layers and semantic/factual information in later layers, but standard decoding doesn't exploit this distinction
Concrete Example: When asked 'On what date was the Declaration of Independence officially signed?', a standard LLaMA model predicts 'July 4, 1776' (a common but factually incorrect date for the signing). DoLa contrasts layers to suppress this common misconception and correctly predicts 'August 2, 1776'.
Key Novelty
Decoding by Contrasting Layers (DoLa)
  • Exploits the modular evolution of knowledge in transformers, where lower layers encode linguistic patterns and higher layers encode facts
  • Dynamically selects a 'premature' layer based on Jensen-Shannon Divergence and subtracts its log-probabilities from the final layer to cancel out non-factual linguistic noise
  • Amplifies the signal of factual knowledge that emerges only in the later layers of the model
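The selection-and-contrast step described in the bullets above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`log_softmax`, `js_divergence`, `dola_scores`), the three-token vocabulary, and the toy logits are all hypothetical, and the `alpha` plausibility cutoff is an added safeguard that is not described in this summary. The sketch assumes each layer's hidden state has already been projected to vocabulary logits.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one logit vector.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def js_divergence(logp, logq):
    # Jensen-Shannon divergence between two distributions given as log-probs.
    p = [math.exp(a) for a in logp]
    q = [math.exp(b) for b in logq]
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def dola_scores(layer_logits, candidate_layers, alpha=0.1):
    # layer_logits: one logit vector per layer; the last entry is the final layer.
    final_logp = log_softmax(layer_logits[-1])
    cand_logp = {j: log_softmax(layer_logits[j]) for j in candidate_layers}
    # Pick the candidate layer whose distribution differs most from the final one.
    premature = max(candidate_layers,
                    key=lambda j: js_divergence(final_logp, cand_logp[j]))
    # Assumption: drop tokens whose final-layer probability is below
    # alpha * (max final-layer probability) before contrasting.
    cutoff = max(final_logp) + math.log(alpha)
    scores = [f - p if f >= cutoff else float("-inf")
              for f, p in zip(final_logp, cand_logp[premature])]
    return premature, scores

# Hypothetical toy data: a 3-token vocabulary and 3 layers.
vocab = ["Seattle", "Olympia", "the"]
layer_logits = [
    [2.0, 0.0, 1.0],   # early layer: the common token dominates
    [2.0, 0.5, 0.5],   # middle layer
    [2.0, 1.8, -1.0],  # final layer: the factual token has nearly caught up
]
premature, scores = dola_scores(layer_logits, candidate_layers=[0, 1])
greedy_final = max(range(len(vocab)), key=lambda i: layer_logits[-1][i])
greedy_dola = max(range(len(vocab)), key=lambda i: scores[i])
print(vocab[greedy_final], vocab[greedy_dola], premature)
```

On this toy input, greedy decoding over the final layer alone still picks the common token, while the contrasted scores favor the token whose probability grew between the selected premature layer and the final layer. Working in log space makes the subtraction equivalent to a probability ratio, which is why a token that is merely likely everywhere gets little credit.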
Architecture
Architecture Figure: Figures 1 & 3
Conceptual illustration of DoLa. It shows the transformer layers processing a query about a capital city. The probabilities of the correct fact ('Olympia') rise in higher layers while common tokens ('Seattle') stay constant.
Evaluation Highlights
  • Improves TruthfulQA scores by 12-17 absolute percentage points across LLaMA family models (7B to 65B)
  • Raises LLaMA-65B performance on TruthfulQA to 54.3% (%Truth*Info), rivaling methods that require supervised fine-tuning like ITI
  • Enhances reasoning on StrategyQA by up to 4% accuracy, showing benefits for chain-of-thought tasks
Breakthrough Assessment
8/10
Simple, inference-time-only method that yields significant double-digit gains in factuality without training or retrieval. Highly practical.