| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Improvement of PretrainZero (Reinforcement Pretraining stage only) over the vanilla Base model | ||||
| MMLU-Pro | Score | Not explicitly reported in the paper | Not explicitly reported in the paper | +8.43 |
| SuperGPQA | Score | Not explicitly reported in the paper | Not explicitly reported in the paper | +5.96 |
| Math Average | Score | Not explicitly reported in the paper | Not explicitly reported in the paper | +10.60 |
| Improvement after general RLVR post-training: PretrainZero initialization over standard Base initialization | ||||
| MMLU-Pro | Score | Not explicitly reported in the paper | Not explicitly reported in the paper | +2.35 |
| SuperGPQA | Score | Not explicitly reported in the paper | Not explicitly reported in the paper | +3.04 |
| Math Average | Score | Not explicitly reported in the paper | Not explicitly reported in the paper | +2.81 |