
CoCoA: Confidence- and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models

A Khandelwal, M Gupta, P Agrawal
Microsoft, India
arXiv, 8/2025
RAG Factuality QA

📝 Paper Summary

Modularized RAG pipeline Hallucination suppression
CoCoA dynamically adjusts reliance on external context during decoding by measuring the entropy gap, Rényi divergence, and contextual peakedness to resolve conflicts between model priors and retrieved evidence.
Core Problem
Standard RAG decoding often fails when retrieved context conflicts with the model's internal memory, a failure mode known as model stubbornness. Existing contrastive methods rely on static weights or uniform divergence metrics, which over-correct in low-conflict scenarios.
Why it matters:
  • Language models frequently prioritize outdated internal knowledge over up-to-date retrieved context, leading to hallucinations in RAG systems
  • Current adaptive methods like AdaCAD saturate on peaked distributions and fail to distinguish meaningful context signals from noise, degrading performance when context and memory actually agree
Concrete Example: In a QA task, if the model 'knows' the answer is A but the context says B, standard decoding might output A. AdaCAD might force B even if the context is noisy. CoCoA detects the specific 'peakedness' of B in the context distribution to trust B only when the signal is strong and the conflict is meaningful.
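The example above can be made concrete with toy numbers. The sketch below is illustrative only and is not taken from the paper: the three-token vocabulary, the logits, and `alpha = 1.0` are made-up values, and `static_contrast` is a hypothetical helper that applies a fixed-weight contrastive adjustment of the form (1 + alpha) * context logits - alpha * prior logits. It shows a barely-preferred B being amplified by the static weight, while the top-2 margin of the context distribution reveals that the context signal is weak.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def static_contrast(logits_ctx, logits_prior, alpha=1.0):
    """Hypothetical fixed-weight contrastive adjustment (alpha is an assumption)."""
    ctx = np.asarray(logits_ctx, dtype=float)
    prior = np.asarray(logits_prior, dtype=float)
    return softmax((1.0 + alpha) * ctx - alpha * prior)

def top2_margin(p):
    """Gap between the two most probable tokens (a simple peakedness measure)."""
    top = np.sort(p)[-2:]          # [second largest, largest]
    return float(top[1] - top[0])

tokens = ["A", "B", "C"]
logits_prior = [3.0, 0.5, 0.0]     # made-up: memory strongly prefers A
logits_ctx = [1.0, 1.2, 0.9]       # made-up: noisy context barely prefers B

p_prior = softmax(logits_prior)
p_ctx = softmax(logits_ctx)
p_static = static_contrast(logits_ctx, logits_prior)

print("prior picks:", tokens[int(np.argmax(p_prior))])
print("static contrast picks:", tokens[int(np.argmax(p_static))])
print("context top-2 margin:", round(top2_margin(p_ctx), 3))
```

With these numbers the static adjustment switches the answer to B even though the context distribution is nearly flat, which is the situation a peakedness check is meant to catch.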
Key Novelty
Confidence- and Context-Aware Adaptive Decoding (CoCoA)
  • Uses Rényi divergence instead of Jensen-Shannon divergence to detect 'tail-heavy' shifts, making the model sensitive to subtle conflicts where the context boosts a low-probability token
  • Introduces 'contextual peakedness' (margin between top-2 tokens) combined with entropy gap to measure how certain the context is, ensuring the model only yields to context when the context is confident
  • Employs a dynamic gating mechanism that blends prior and context distributions based on a conflict score derived from divergence and uncertainty measures
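The three signals listed above can be sketched in a few lines. The definitions of Rényi divergence, Shannon entropy, and top-2 margin are standard, but everything else here is an assumption made for illustration: the order `alpha = 0.5`, the way the signals are combined in `gate_weight`, and the linear blend in `blend` are hypothetical and should not be read as the paper's actual conflict score or gating formula.

```python
import numpy as np

EPS = 1e-12  # guards against log(0) and division issues

def renyi_divergence(p, q, alpha=0.5):
    """Renyi divergence D_alpha(p || q) for alpha > 0, alpha != 1."""
    p = np.clip(np.asarray(p, dtype=float), EPS, 1.0)
    q = np.clip(np.asarray(q, dtype=float), EPS, 1.0)
    return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

def entropy(p):
    """Shannon entropy in nats."""
    p = np.clip(np.asarray(p, dtype=float), EPS, 1.0)
    return float(-np.sum(p * np.log(p)))

def top2_margin(p):
    """Contextual peakedness: gap between the top-2 token probabilities."""
    top = np.sort(np.asarray(p, dtype=float))[-2:]
    return float(top[1] - top[0])

def gate_weight(p_prior, p_ctx, alpha=0.5):
    """Hypothetical gate in [0, 1]: high only if context conflicts AND is peaked."""
    conflict = renyi_divergence(p_ctx, p_prior, alpha)
    gap = entropy(p_prior) - entropy(p_ctx)    # > 0 when context is more certain
    squashed = 1.0 / (1.0 + np.exp(-(conflict + gap)))
    return float(squashed * top2_margin(p_ctx))

def blend(p_prior, p_ctx, alpha=0.5):
    """Assumed linear blend of prior and context distributions."""
    lam = gate_weight(p_prior, p_ctx, alpha)
    return (1.0 - lam) * np.asarray(p_prior) + lam * np.asarray(p_ctx)

# Made-up distributions over three candidate tokens [A, B, C].
p_prior = np.array([0.90, 0.05, 0.05])         # memory prefers A
p_ctx_confident = np.array([0.05, 0.90, 0.05]) # context clearly says B
p_ctx_noisy = np.array([0.30, 0.36, 0.34])     # context is nearly flat

lam_confident = gate_weight(p_prior, p_ctx_confident)
lam_noisy = gate_weight(p_prior, p_ctx_noisy)
out_confident = blend(p_prior, p_ctx_confident)
out_noisy = blend(p_prior, p_ctx_noisy)
```

Under these assumptions the gate opens for the confident context, so the blended distribution favors B, and stays nearly closed for the flat context, so the blended distribution stays with A. Multiplying by the top-2 margin is one simple way to make the gate depend on context confidence; other combinations would serve the same purpose.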
Architecture
Figure 1
Conceptual framework of CoCoA comparing standard decoding, CAD, and CoCoA. It illustrates how CoCoA blends distributions using conflict and confidence measures.
Evaluation Highlights
  • Achieves up to +9.2 points average accuracy improvement over the strong baseline AdaCAD across QA benchmarks (NQ, TriviaQA, PopQA, HotpotQA, TabMWP)
  • Outperforms GPT-4o-mini by +4.43 ROUGE-L and +2.10 FaithScore on CLAPNQ when applied to Llama-3.1-70B
  • Attains 86.32 AlignScore on TofuEval summarization, surpassing greedy decoding by 9.66 points and AdaCAD by 1.25 points
Breakthrough Assessment
8/10
Strong, consistent improvements over state-of-the-art adaptive decoding methods (AdaCAD) across diverse tasks. The use of Rényi divergence and entropy gap offers a theoretically grounded improvement for handling subtle distribution shifts.