
LLM Hallucination Detection: A Fast Fourier Transform Method Based on Hidden Layer Temporal Signals

Jinxin Li, Gang Tu, ShengYu Cheng, Junjie Hu, Jinting Wang, Rui Chen, Zhilong Zhou, Dongbo Shan
Huazhong University of Science and Technology
arXiv (2025)
Factuality QA Benchmark

📝 Paper Summary

Hallucination detection, Internal state analysis / Interpretability
HSAD (Hidden Signal Analysis-based Detection) detects LLM hallucinations by treating the autoregressive generation process as a temporal signal and applying the Fast Fourier Transform to hidden states to identify anomalies in the frequency domain.
Core Problem
Existing hallucination detection methods rely either on external knowledge bases, which have limited coverage, or on static hidden-state analysis, which fails to capture temporal reasoning dynamics.
Why it matters:
  • Hallucinations undermine credibility and restrict LLM deployment in high-stakes scenarios like medical or legal advice
  • Fact-checking against external bases is computationally expensive and limited by the freshness of the knowledge base
  • Static analysis misses the 'thought process' evolution, which cognitive neuroscience suggests contains signals of fabrication
Concrete Example: When an LLM fabricates an answer about a historical event, its internal confidence and attention patterns fluctuate differently than when it recalls a fact. Static analysis looks at a single snapshot and misses this fluctuation, whereas HSAD captures the 'wobble' in the signal across layers.
Key Novelty
Hidden Signal Analysis-based Detection (HSAD)
  • Models the LLM's forward pass across layers as a 'temporal' signal, analogous to biological neural signals changing over time during cognitive conflict
  • Applies Fast Fourier Transform (FFT) to these cross-layer hidden states to extract spectral features (frequencies)
  • Uses the strongest non-DC frequency components to train a lightweight classifier that distinguishes between factual and hallucinatory generation paths
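The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `spectral_features`, the per-dimension FFT, the input shape `(num_layers, hidden_dim)`, and the choice of `k = 3` components are all hypothetical, since the summary does not specify them.

```python
import numpy as np


def spectral_features(hidden_states: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the k largest non-DC FFT magnitudes per hidden dimension.

    hidden_states: array of shape (num_layers, hidden_dim) holding the hidden
    vector of one token position at every layer (assumed layout).
    Output shape: (k, hidden_dim), sorted from strongest to weakest.
    """
    if hidden_states.ndim != 2:
        raise ValueError("expected a (num_layers, hidden_dim) array")

    # Treat the layer index as the "time" axis and take a real-input FFT.
    spectrum = np.fft.rfft(hidden_states, axis=0)
    magnitude = np.abs(spectrum)

    # Bin 0 is the DC (zero-frequency) component, i.e. the mean across layers.
    non_dc = magnitude[1:]
    if k > non_dc.shape[0]:
        raise ValueError("k exceeds the number of non-DC frequency bins")

    # Sort each column in descending order and keep the k strongest bins.
    return -np.sort(-non_dc, axis=0)[:k]
```

The resulting array could be flattened and passed to any lightweight classifier (for example, logistic regression or a small MLP) together with factual/hallucinated labels; which classifier the paper uses is not stated in this summary.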
Architecture
Figure 1
Conceptual framework of HSAD. It illustrates the analogy between human cognitive signals and LLM hidden states, showing the extraction of hidden vectors across layers, construction of a temporal signal, FFT transformation, and final classification.
Evaluation Highlights
  • Achieves the highest AUROC across 4 datasets (TruthfulQA, TriviaQA, SciQ, NQ Open), outperforming baselines such as SAPLM and INSIDE
  • +13.1 percentage-point improvement in AUROC on TruthfulQA with LLaMA-3.1-8B over the SAPLM baseline
  • Demonstrates that observing the signal at the 'Answer End' position yields significantly better detection than observing at the question start or middle
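For readers unfamiliar with the metric, AUROC is the probability that a detector scores a randomly chosen hallucinated answer higher than a randomly chosen factual one. The sketch below computes it by pairwise comparison and shows how a percentage-point gap between two detectors is obtained. The function name `auroc` and all scores and labels are made-up toy values for illustration, not results from the paper.

```python
def auroc(scores, labels):
    """Pairwise (Mann-Whitney) AUROC; label 1 = hallucinated, 0 = factual."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two detectors scoring the same four answers.
labels = [1, 1, 0, 0]
detector_a = [0.9, 0.8, 0.2, 0.1]  # ranks every hallucination above every fact
detector_b = [0.9, 0.3, 0.8, 0.1]  # misranks one pair

# Difference expressed in percentage points.
gap_pp = (auroc(detector_a, labels) - auroc(detector_b, labels)) * 100
```

A percentage-point gap is the plain difference between two AUROC values multiplied by 100, not a relative change.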
Breakthrough Assessment
7/10
Novel application of signal processing (FFT) to internal model states for hallucination detection. Strong empirical results, though primarily an interpretability/detection technique rather than a new architectural paradigm.