
Dissecting Chronos: Sparse Autoencoders Reveal Causal Feature Hierarchies in Time Series Foundation Models

Anurag Mishra
Rochester Institute of Technology
arXiv (2026)
Pretraining Benchmark

📝 Paper Summary

Mechanistic Interpretability · Time Series Forecasting
Applying sparse autoencoders to the Chronos-T5 model reveals that mid-layer change-detection features are causally critical for forecasting, whereas final-layer semantic features are redundant or even detrimental.
Core Problem
Time Series Foundation Models (TSFMs) are increasingly deployed in high-stakes domains, yet they remain 'black boxes': their internal decision-making processes are opaque.
Why it matters:
  • Lack of transparency in high-stakes forecasting (e.g., energy, finance) creates risk
  • Prior interpretability methods for time series were post-hoc (saliency maps) rather than mechanistic, failing to explain internal computation
  • It is unknown whether TSFMs rely on robust causal features or spurious correlations
Concrete Example: When predicting a time series with a sudden level shift, it is unclear if the model detects the shift explicitly or relies on periodic patterns. This study shows that ablating a single mid-layer feature (ID 4616) causes a massive forecast error (CRPS spike of 38.61), proving the model's 'catastrophic dependence' on specific internal change-detection circuits.
Key Novelty
First application of Sparse Autoencoders (SAEs) to a Time Series Foundation Model
  • Train TopK Sparse Autoencoders on the internal activations of Chronos-T5 to decompose dense representations into interpretable features
  • Map these features to a taxonomy of temporal concepts (trends, seasonality, level shifts) using synthetic data correlations
  • Validate feature importance via causal ablation (zeroing out features) to measure their direct impact on forecast accuracy (CRPS)
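The three steps above (TopK sparse coding, then causal ablation of individual latent features) can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the dimensions, weights, and the helper names `topk_sae_encode` and `ablate_feature` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_sae_encode(x, W_enc, b_enc, k):
    """Encode activations, keeping only each row's top-k latent features (TopK SAE)."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)                  # ReLU pre-activations
    thresh = np.partition(z, -k, axis=-1)[..., -k][..., None]  # k-th largest per row
    return np.where(z >= thresh, z, 0.0)                    # zero everything below it

def topk_sae_decode(z, W_dec, b_dec):
    """Reconstruct the dense activation from the sparse code."""
    return z @ W_dec + b_dec

def ablate_feature(z, feature_id):
    """Causal ablation: zero out one latent feature before decoding."""
    z = z.copy()
    z[..., feature_id] = 0.0
    return z

# Toy dimensions (illustrative only; not Chronos-T5's actual sizes)
d_model, d_sae, k = 16, 64, 8
W_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)
b_enc, b_dec = np.zeros(d_sae), np.zeros(d_model)

acts = rng.normal(size=(4, d_model))        # stand-in for residual-stream activations
z = topk_sae_encode(acts, W_enc, b_enc, k)  # sparse, interpretable feature code
recon = topk_sae_decode(z, W_dec, b_dec)
recon_ablated = topk_sae_decode(ablate_feature(z, 5), W_dec, b_dec)
```

In the paper's pipeline, `recon_ablated` would be patched back into the model and the resulting change in CRPS measured; a large degradation marks the ablated feature as causally critical.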
Evaluation Highlights
  • 100% of 392 ablated features produced positive CRPS degradation, confirming universal causal relevance of SAE features
  • Mid-encoder (Block 11) features are the most critical, with a single feature causing a +38.61 CRPS degradation (max-to-median ratio of 30.5x)
  • Final-encoder (Block 23) progressive ablation paradoxically improves forecast quality (CRPS decreases from 3.62 to 2.73), suggesting overfitting or redundancy
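The CRPS figures quoted above score probabilistic forecasts against observed values. For context, here is the standard sample-based CRPS estimator, CRPS ≈ E|X − y| − ½·E|X − X′|, in numpy; this is the generic formula, not the paper's evaluation code.

```python
import numpy as np

def crps_ensemble(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    `samples` are draws from the forecast distribution, `y` the observed value.
    Lower is better; a perfect point forecast scores 0.
    """
    samples = np.asarray(samples, dtype=float)
    term1 = np.abs(samples - y).mean()                       # accuracy term
    term2 = np.abs(samples[:, None] - samples[None, :]).mean()  # sharpness term
    return term1 - 0.5 * term2

# An ensemble straddling the truth is penalized for its spread:
crps_ensemble([2.0, 4.0], 3.0)   # → 0.5
```

Under this metric, the Block 23 result is striking: removing features lowers CRPS from 3.62 to 2.73, i.e. the forecasts genuinely improve.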
Breakthrough Assessment
8/10
Significant methodology transfer (SAEs to Time Series) with counterintuitive findings about layer hierarchy (mid-layer bottleneck). Strong causal validation, though limited to one model and benchmark.