| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| **Comparison of Peak Performance (Pass@1) and Training Costs (Time/Steps) on 7B Models** | | | | |
| Average across 5 benchmarks | Wall-clock Time (hours) | 28.28 | 12.22 | -16.06 |
| Average across 5 benchmarks | Peak Training Step | 350 | 150 | -200 |
| AIME 2024 | Pass@1 | 0.554 | 0.560 | +0.006 |
| **Comparison against DAPO (Dynamic Sampling) on 8B Models, highlighting the disconnect between step reduction and time reduction** | | | | |
| Average across 5 benchmarks | Wall-clock Time (hours) | 93.75 | 17.40 | -76.35 |
| Average across 5 benchmarks | Time per Step (s) | 2109 | 348 | -1761 |
| MATH-500 | Pass@1 | 0.889 | 0.888 | -0.001 |
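
The Δ column reports absolute differences, which are hard to compare across rows with different units. As a minimal sketch, the cost rows above can be restated as speedup ratios (baseline divided by this paper). The helper name `speedup` and the `cost_rows` dictionary are illustrative assumptions, not part of the paper; the numbers are copied from the table.

```python
def speedup(baseline: float, ours: float) -> float:
    """Return baseline / ours; values above 1 mean the new method is cheaper."""
    return baseline / ours


# (baseline, this paper) pairs copied from the cost rows of the table.
# The labels are hypothetical names chosen for this sketch.
cost_rows = {
    "7B wall-clock time (hours)": (28.28, 12.22),
    "7B peak training step": (350, 150),
    "8B wall-clock time (hours)": (93.75, 17.40),
    "8B time per step (s)": (2109, 348),
}

for label, (baseline, ours) in cost_rows.items():
    print(f"{label}: {speedup(baseline, ours):.2f}x")
```

In the 7B setting the step ratio and the wall-clock ratio come out nearly equal, whereas in the 8B comparison against DAPO the per-step ratio is larger than the wall-clock ratio, which is one way to see that step counts and elapsed time do not shrink in lockstep.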