Evaluation Setup
Controlled experiments on datasets with known spurious (shortcut) features, used to measure when training transitions from the shortcut solution to the structured one
Benchmarks:
- Modular Arithmetic (Algorithmic reasoning)
- CIFAR-10 (Spurious) (Image classification with added shortcuts) [New]
- CelebA (Face attribute classification)
- Waterbirds (Robustness to background correlation)
Metrics:
- Clean Accuracy
- Transition Delay (epochs/steps)
- Parameter Norm
- Statistical methodology: R-squared regression analysis for validating scaling laws
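The R-squared check listed above can be sketched as follows. This is a minimal illustration, assuming the validation is an ordinary least-squares line fit of observed transition delays against theoretically predicted ones; the function name `r_squared` and the input arrays are hypothetical and not taken from the paper.

```python
import numpy as np


def r_squared(predicted, observed):
    """Coefficient of determination for a line fit observed ~ a * predicted + b.

    Assumption: the scaling law is validated with a simple linear regression.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)

    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(predicted, observed, deg=1)
    fitted = slope * predicted + intercept

    ss_res = np.sum((observed - fitted) ** 2)           # residual sum of squares
    ss_tot = np.sum((observed - observed.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up delays (in steps), for illustration only.
    theory = [100, 200, 400, 800]
    measured = [110, 190, 420, 790]
    print(f"R^2 = {r_squared(theory, measured):.3f}")
```

A value close to 1 means the predicted delays explain almost all of the variance in the measured delays. If the scaling law is a power law, the same function can be applied to log-transformed delays.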
Key Results
| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Modular Arithmetic | R^2 (Transition Delay vs Theory) | 0 | 0.97 | +0.97 |
| CIFAR-10 (Spurious) | Clean Accuracy | 10.0 | 78.0 | +68.0 |
| CIFAR-10 | Norm Ratio (V_sc / V_st) | 1.0 | 37.0 | +36.0 |
| CelebA | Norm Ratio (V_sc / V_st) | 1.0 | 3.0 | +2.0 |
Main Takeaways
- Transition delay is governed by the ratio of shortcut norm to structured norm and the effective regularization strength.
- Three distinct regimes exist: weak regularization (permanent shortcut), intermediate regularization (delayed transition, i.e. grokking), and strong regularization (no learning).
- ResNet18 with BatchNorm exhibits the same peak-then-decay norm dynamics as unnormalized models.
- The Waterbirds dataset marks the framework's boundary: the norm dynamics transfer, but the representational transition fails because there is no clean norm separation.
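To make the transition-delay metric and the three regimes concrete, here is a small sketch that reads both off logged accuracy curves. It rests on assumptions not stated in the summary: delay is taken as the gap between the step where training accuracy first reaches a threshold and the step where clean accuracy first reaches it, the threshold of 0.95 is arbitrary, and a regime label only describes what happened within the logged horizon. All function names are hypothetical.

```python
def first_crossing(values, threshold):
    """Return the index of the first entry >= threshold, or None if never reached."""
    for step, value in enumerate(values):
        if value >= threshold:
            return step
    return None


def transition_delay(train_acc, clean_acc, threshold=0.95):
    """Steps between fitting the training set and reaching clean accuracy.

    Assumption: both curves are logged at the same steps.
    Returns None if either curve never reaches the threshold.
    """
    fit_step = first_crossing(train_acc, threshold)
    clean_step = first_crossing(clean_acc, threshold)
    if fit_step is None or clean_step is None:
        return None
    return clean_step - fit_step


def classify_regime(train_acc, clean_acc, threshold=0.95):
    """Label a run with one of the three regimes, within the logged horizon."""
    if first_crossing(train_acc, threshold) is None:
        return "strong (no learning)"
    if first_crossing(clean_acc, threshold) is None:
        return "weak (permanent shortcut)"
    return "intermediate (delayed transition)"


if __name__ == "__main__":
    # Made-up curves, for illustration only.
    train = [0.20, 0.60, 0.96, 0.99, 1.00, 1.00]
    clean = [0.10, 0.10, 0.10, 0.20, 0.50, 0.97]
    print(transition_delay(train, clean))  # 3
    print(classify_regime(train, clean))   # intermediate (delayed transition)
```

A run labeled "weak" here may still transition after the logged horizon, so the label is only as reliable as the training budget is long.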