
Detecting Contextual Hallucinations in LLMs with Frequency-Aware Attention

Siya Qi, Yudong Chen, Runcong Zhao, Qinglin Zhu, Zhanghao Hu, Wei Liu, Yulan He, Zheng Yuan, Lin Gui
University of Warwick, Tencent, King's College London
arXiv (2026)
Factuality RAG QA

📝 Paper Summary

Hallucination suppression Factuality
Hallucinated tokens in LLMs exhibit rapid, high-frequency fluctuations in attention weights, which can be detected by treating attention as a discrete signal and extracting high-frequency energy components.
Core Problem
Existing attention-based hallucination detectors rely on coarse summary statistics (like entropy or total mass) that fail to capture the fine-grained sequential instability characteristic of ungrounded generation.
Why it matters:
  • Hallucinations in context-based generation (like RAG or summarization) undermine trust in LLM systems expected to be grounded in source material
  • Post-hoc verification methods are computationally expensive and do not reflect the model's internal generation dynamics
  • Current internal metrics miss the structural 'jaggedness' of attention that signals when a model is confused or ungrounded
Concrete Example: When LLaMA-2-7B-Chat hallucinates 'December' (a word absent from the context), its attention distribution shows sharp, rapid peaks and drops across token positions. The entropy of that distribution can look normal, yet the weights oscillate far more than the smooth attention profile of a grounded token.
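To see why a summary statistic can miss this, note that entropy depends only on the set of attention weights, not on their order along the context. The sketch below is illustrative only: the weights are made up, and `roughness` (sum of squared first differences) is a hypothetical stand-in for "jaggedness", not the paper's feature.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of an attention distribution (natural log)."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def roughness(p):
    """Sum of squared differences between neighbouring positions.

    Hypothetical jaggedness measure used only for this illustration.
    """
    d = np.diff(np.asarray(p, dtype=float))
    return float((d ** 2).sum())

# Made-up attention weights over 10 context tokens (they sum to 1).
weights = np.array([0.02, 0.03, 0.05, 0.10, 0.30,
                    0.30, 0.10, 0.05, 0.03, 0.02])

smooth = weights                                  # one smooth bump
jagged = weights[[4, 0, 5, 1, 3, 9, 6, 8, 2, 7]]  # same values, interleaved

print(entropy(smooth), entropy(jagged))      # equal up to rounding
print(roughness(smooth), roughness(jagged))  # jagged is clearly larger
```

Both orderings have the same entropy and the same total mass, so only a position-aware measure tells them apart.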
Key Novelty
Frequency-Aware Attention Analysis
  • Treats the sequence of attention weights over context tokens as a discrete time-series signal indexed by token position
  • Applies signal processing operators (Fourier Transform, Wavelets, Laplacian) to isolate high-frequency components that represent rapid local changes
  • Uses the energy (L2 norm) of these high-frequency components as a feature vector to classify tokens as hallucinated or grounded
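The three steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `cutoff_ratio` threshold, the choice of a 1-D `[1, -2, 1]` Laplacian kernel, and the per-head feature layout are all assumptions, and the wavelet variant is omitted.

```python
import numpy as np

def fourier_high_energy(attn, cutoff_ratio=0.5):
    """L2 norm of the Fourier coefficients above a cutoff frequency.

    attn: 1-D attention weights over context tokens for one generated token.
    cutoff_ratio: assumed fraction of the spectrum treated as "low" frequency.
    """
    attn = np.asarray(attn, dtype=float)
    spectrum = np.fft.rfft(attn)               # one-sided spectrum
    cutoff = int(len(spectrum) * cutoff_ratio)
    high = spectrum[cutoff:]                   # keep the high-frequency bins
    return float(np.sqrt((np.abs(high) ** 2).sum()))

def laplacian_energy(attn):
    """L2 norm of the discrete second difference of the attention signal."""
    attn = np.asarray(attn, dtype=float)
    lap = np.convolve(attn, [1.0, -2.0, 1.0], mode="valid")
    return float(np.linalg.norm(lap))

def token_features(attn_per_head, cutoff_ratio=0.5):
    """Stack both energies for every head into one feature vector.

    attn_per_head: array of shape (num_heads, context_len).
    Returns a vector of length 2 * num_heads.
    """
    rows = [[fourier_high_energy(a, cutoff_ratio), laplacian_energy(a)]
            for a in np.asarray(attn_per_head, dtype=float)]
    return np.array(rows).ravel()

# Synthetic signals: a smooth bump versus an alternating pattern.
pos = np.arange(32)
smooth = np.exp(-0.5 * ((pos - 16) / 4.0) ** 2)
smooth /= smooth.sum()
jagged = np.where(pos % 2 == 0, 1.0, 0.05)
jagged /= jagged.sum()

features = token_features(np.stack([smooth, jagged]))
print(features.reshape(2, 2))  # rows: head, columns: [fourier, laplacian]
```

Feature vectors like these, computed per generated token, could then be fed to any standard binary classifier (for example logistic regression) trained on tokens labeled as hallucinated or grounded.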
Evaluation Highlights
  • Fourier-high features improve AUROC by 6.6% over Lookback-Lens on the RAGTruth summarization task with LLaMA-13B
  • Gains are consistent across three models (LLaMA-7B, LLaMA-13B, Mistral-7B) and two benchmarks (RAGTruth, HalluRAG)
  • Span-level detection improves AUROC by 10.1% on summarization tasks (LLaMA-7B) compared to the strong attention-based baseline Lookback-Lens
Breakthrough Assessment
7/10
Offers a novel, theoretically grounded perspective (signal processing) on attention analysis. While the method is a feature engineering step for a classifier rather than a new architecture, the consistent empirical gains and cross-task robustness are significant.