
Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context

Faris Chaudhry, Siddhant Gadkari
Department of Computer Science, Imperial College London
arXiv (2026)
Memory Reasoning

📝 Paper Summary

In-Context Learning (ICL) Mechanisms, Mechanistic Interpretability, Statistical Learning Theory
Transformers trained on dynamic binary classification tasks naturally learn to approximate the Bayes-optimal likelihood-ratio test from context, adapting their internal circuit depth to match the geometric complexity of the task.
Core Problem
While In-Context Learning (ICL) allows Transformers to adapt to new tasks, it is unclear whether they rely on simple similarity heuristics (like nearest neighbors) or construct principled statistical algorithms on the fly.
Why it matters:
  • Understanding the algorithmic ground truth of ICL is essential for safety and interpretability, because it determines whether models are reasoning or merely pattern-matching
  • Existing research focuses on regression with fixed forms; analyzing discrimination tasks allows comparison against the rigorous optimality bounds of the Neyman-Pearson lemma
  • Mechanistic interpretability lacks testbeds where the 'correct' internal algorithm is mathematically known; this work provides such a ground-truth setting
Concrete Example: In a 'shifted mean' task where the decision boundary is linear but off-center, a model relying on simple dot-product similarity (assuming a fixed center) would fail. The proposed analysis checks if the Transformer dynamically infers the shift vector $k$ from context to center the data correctly before classifying.
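The failure mode in this example can be reproduced with a minimal sketch. Everything here is an assumption made for illustration, not the paper's setup: the two classes are taken to be Gaussians centered at $k + \mu$ and $k - \mu$ with identity covariance, and `sample_task`, `fit_from_context`, and `evaluate` are hypothetical helper names. Both rules use the same direction estimated from labeled context; they differ only in whether the data is centered at the origin or at the inferred shift.

```python
import random

def sample_task(rng, n, dim=4, shift_scale=5.0):
    """Hypothetical shifted-mean task: classes are N(k + mu, I) and N(k - mu, I)."""
    k = [shift_scale * rng.gauss(0, 1) for _ in range(dim)]  # task-specific shift
    mu = [rng.gauss(0, 1) for _ in range(dim)]
    norm = sum(m * m for m in mu) ** 0.5
    mu = [m / norm for m in mu]                               # unit class direction
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)                                 # label in {0, 1}
        s = 1 if y == 1 else -1
        xs.append([k[j] + s * mu[j] + rng.gauss(0, 1) for j in range(dim)])
        ys.append(y)
    return xs, ys

def class_mean(xs, ys, label):
    pts = [x for x, y in zip(xs, ys) if y == label]
    return [sum(col) / len(pts) for col in zip(*pts)]

def fit_from_context(xs, ys):
    """Estimate the class direction and the shift (class midpoint) from context."""
    m1, m0 = class_mean(xs, ys, 1), class_mean(xs, ys, 0)
    w = [a - b for a, b in zip(m1, m0)]
    center = [(a + b) / 2 for a, b in zip(m1, m0)]
    return w, center

def predict(x, w, center):
    """Linear rule: sign of the dot product after subtracting `center`."""
    return int(sum(wj * (xj - cj) for wj, xj, cj in zip(w, x, center)) > 0)

def evaluate(n_tasks=200, n_ctx=32, n_query=100, seed=0):
    rng = random.Random(seed)
    hits_fixed = hits_centered = total = 0
    for _ in range(n_tasks):
        xs, ys = sample_task(rng, n_ctx + n_query)
        w, center = fit_from_context(xs[:n_ctx], ys[:n_ctx])
        origin = [0.0] * len(w)                               # fixed-center assumption
        for x, y in zip(xs[n_ctx:], ys[n_ctx:]):
            hits_fixed += predict(x, w, origin) == y
            hits_centered += predict(x, w, center) == y
            total += 1
    return hits_fixed / total, hits_centered / total

if __name__ == "__main__":
    acc_fixed, acc_centered = evaluate()
    print(f"fixed center: {acc_fixed:.3f}  inferred center: {acc_centered:.3f}")
```

Under these assumptions the fixed-center rule should land well below the rule that first subtracts the inferred shift, since a large $k$ pushes most points to one side of a boundary through the origin.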
Key Novelty
ICL as Adaptive Statistical Inference
  • Models the ICL process as a binary hypothesis test, where the optimal policy is mathematically defined by the likelihood-ratio test (LRT), i.e., by thresholding the log-likelihood ratio (LLR)
  • Demonstrates that the model does not use a fixed heuristic but adapts its computation: acting as a 'voting ensemble' for linear tasks and a 'sequential processor' for nonlinear variance tasks
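For reference, the likelihood-ratio test can be written out. The Gaussian forms below are an assumed illustration (isotropic noise with scale $\sigma$, class direction $\mu$, and shift $k$ as in the concrete example), not the paper's stated task definitions:

$$
\Lambda(x) = \frac{p_1(x)}{p_0(x)}, \qquad \text{decide } H_1 \iff \log \Lambda(x) > \tau
$$

$$
\text{Shifted means } \mathcal{N}(k \pm \mu,\ \sigma^2 I):\quad \log \Lambda(x) = \frac{2\,\mu^\top (x - k)}{\sigma^2}
$$

$$
\text{Variances } \mathcal{N}(0, \sigma_1^2 I) \text{ vs. } \mathcal{N}(0, \sigma_0^2 I),\ x \in \mathbb{R}^d:\quad \log \Lambda(x) = d \log\frac{\sigma_0}{\sigma_1} + \frac{\lVert x \rVert^2}{2}\left(\frac{1}{\sigma_0^2} - \frac{1}{\sigma_1^2}\right)
$$

The first statistic is linear in $x$ once the shift $k$ is removed; the second is quadratic in $x$, which is one way to see why the two kinds of task differ in geometric complexity.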
Evaluation Highlights
  • Achieves 83.0% accuracy on nonlinear variance discrimination (Task B), within one percentage point of the Bayes-optimal oracle (84.0%)
  • Internal logits show near-perfect rank alignment with the theoretical log-likelihood ratio for Task B (Spearman ρ = 0.98), despite nonlinear calibration
  • Linear shifted-mean tasks (Task A) show an optimality gap (78.3% vs. 84.6% for the oracle); here the model uses a greedy approximation rather than exact symbolic recovery
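The phrase "rank alignment despite nonlinear calibration" can be made concrete with a small sketch. Spearman correlation compares ranks only, so any strictly increasing distortion of the log-likelihood ratio leaves it at 1 while the Pearson correlation drops. The setup is assumed for illustration: a two-dimensional variance task with $\sigma_0 = 1$ and $\sigma_1 = 2$, and a cubic distortion standing in for a miscalibrated logit. None of these values or helper names come from the paper.

```python
import math
import random

def llr_variance(x, sigma0=1.0, sigma1=2.0):
    """log p1(x)/p0(x) for N(0, sigma1^2 I) versus N(0, sigma0^2 I)."""
    d = len(x)
    sq_norm = sum(v * v for v in x)
    return d * math.log(sigma0 / sigma1) + 0.5 * sq_norm * (1 / sigma0**2 - 1 / sigma1**2)

def ranks(values):
    """Rank of each entry (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def spearman(a, b):
    """Spearman rho as the Pearson correlation of ranks (valid without ties)."""
    return pearson(ranks(a), ranks(b))

def demo(n=2000, dim=2, seed=0):
    rng = random.Random(seed)
    llrs = []
    for _ in range(n):
        sigma = 2.0 if rng.randint(0, 1) == 1 else 1.0
        x = [rng.gauss(0, sigma) for _ in range(dim)]
        llrs.append(llr_variance(x))
    # Hypothetical "model logit": a strictly increasing, nonlinear function of the LLR.
    logits = [v ** 3 for v in llrs]
    same_decisions = all((l > 0) == (v > 0) for l, v in zip(logits, llrs))
    return spearman(logits, llrs), pearson(logits, llrs), same_decisions

if __name__ == "__main__":
    rho_s, rho_p, same = demo()
    print(f"Spearman: {rho_s:.3f}  Pearson: {rho_p:.3f}  same decisions: {same}")
```

Because the distortion here preserves sign, thresholding the distorted score at zero gives the same decisions as thresholding the log-likelihood ratio, which is why a high rank correlation can coexist with poorly calibrated logit values.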
Breakthrough Assessment
7/10
Provides a mathematically rigorous framework for interpreting ICL as statistical inference. While the models are small 'toy' Transformers, the mechanistic link between circuit depth and task geometry is a significant insight.