| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Ablation study on LLaMA-7B to determine optimal placement, reporting average accuracy across math reasoning datasets. | | | | |
| Math Reasoning Average | Accuracy | 59.5 | 61.7 | +2.2 |
| Math Reasoning Average | Accuracy | 59.5 | 60.0 | +0.5 |
| Math Reasoning Average | Accuracy | 42.0 | 61.7 | +19.7 |