
Task-Conditioned Routing Signatures in Sparse Mixture-of-Experts Transformers

Mynampati Sri Ranganadha Avinash
Asthra Labs, Independent Researcher
arXiv (2026)
Topics: Pretraining, Reasoning, Factuality, QA

📝 Paper Summary

Tags: Sparse Mixture-of-Experts (MoE), Mechanistic Interpretability, Internal Model Representations
Routing signatures—vectors summarizing expert activation patterns—reveal that sparse MoE transformers systematically route tokens to different experts based on task category, enabling high-accuracy task classification solely from routing telemetry.
Core Problem
The internal routing behavior of sparse MoE models is poorly understood, often treated merely as a load-balancing mechanism rather than a meaningful signal of how computation is allocated across tasks.
Why it matters:
  • Routing is central to interpretability: if tasks use distinct experts, routing offers a tractable view into modular computation
  • Debugging: abnormal routing may signal expert collapse or drift in deployed systems
  • Scientific understanding: determining whether sparse models implement different computation pathways for different tasks is key to understanding neural modularity
Concrete Example: A code-generation prompt might activate one specific set of experts in deep layers, while a creative-writing prompt activates a different set. Standard analysis treats these choices as random or load-balanced noise, missing the structural connection between the task type and the particular experts chosen.
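To make the "routing signature" idea concrete, here is a minimal sketch of how one could be built from routing telemetry. The function name, input layout (per-layer arrays of top-k expert indices per token), and dimensions are illustrative assumptions, not the paper's exact format:

```python
import numpy as np

def routing_signature(expert_choices, n_experts):
    """Build a routing signature: per-layer expert-usage frequencies,
    concatenated across layers into one flat vector per prompt.

    expert_choices: list of (n_tokens, top_k) integer arrays, one per layer,
    giving the experts selected for each token (hypothetical layout).
    """
    per_layer = []
    for choices in expert_choices:
        counts = np.bincount(choices.ravel(), minlength=n_experts)
        per_layer.append(counts / counts.sum())  # normalise to frequencies
    return np.concatenate(per_layer)

# Toy example: 2 layers, 8 experts, top-2 routing over 5 tokens
rng = np.random.default_rng(0)
choices = [rng.integers(0, 8, size=(5, 2)) for _ in range(2)]
sig = routing_signature(choices, n_experts=8)
print(sig.shape)  # one 8-dim frequency block per layer -> (16,)
```

Each layer contributes a probability-like block that sums to 1, so prompts of different lengths yield comparable vectors.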
Key Novelty
Routing Signatures for Task Analysis
  • Introduces 'routing signatures': compact vector representations that summarize the frequency of expert usage across all layers for a specific prompt
  • Demonstrates that these signatures are not random but cluster strongly by task, exceeding what load-balancing alone would predict
  • Shows that simple linear classifiers can predict the task type (e.g., Code vs. Math) with >92% accuracy using only these routing patterns
Evaluation Highlights
  • Within-category routing similarity (0.8435) significantly exceeds across-category similarity (0.6225), confirming strong task clustering
  • A logistic regression classifier achieves 92.5% accuracy in 4-way task classification using only routing signatures
  • Task separation peaks in deeper layers (around layer 13), suggesting routing specialization increases with network depth
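The within- versus across-category similarity comparison is straightforward to reproduce in miniature. The category means, noise scale, and 16-dimensional signatures below are hypothetical placeholders for the paper's real telemetry; the point is only the shape of the computation (mean pairwise cosine within a category versus between categories):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
# Two hypothetical task categories with distinct mean signatures
mean_a, mean_b = rng.random(16), rng.random(16)
sigs_a = [mean_a + 0.05 * rng.normal(size=16) for _ in range(10)]
sigs_b = [mean_b + 0.05 * rng.normal(size=16) for _ in range(10)]

# Mean pairwise cosine similarity within one category (distinct pairs only)
within = np.mean([cosine(x, y) for i, x in enumerate(sigs_a)
                  for j, y in enumerate(sigs_a) if i < j])
# Mean cosine similarity across the two categories
across = np.mean([cosine(x, y) for x in sigs_a for y in sigs_b])
print(f"within: {within:.3f}  across: {across:.3f}")
```

A within/across gap like the paper's 0.8435 versus 0.6225 is what this statistic looks like when routing is task-conditioned; under pure load balancing the two numbers would be roughly equal.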
Breakthrough Assessment
7/10
Provides compelling empirical evidence that MoE routing is semantic and task-conditioned, not just a load-balancing artifact. The methodology is simple but the insight is fundamental for MoE interpretability.