Main results on 7B models show QFFT maintains accuracy while drastically cutting token usage compared to standard SFT.

| Benchmark | Metric | Baseline (SFT) | This Paper (QFFT) | Δ |
|---|---|---|---|---|
| MATH500 (7B) | Accuracy | 80.8 | 80.2 | -0.6 |
| MATH500 (7B) | Tokens | 5300 | 2800 | -2500 |
| GSM8K (7B) | Tokens | 1700 | 400 | -1300 |
| MATH (Noise Level IV) | Accuracy | 0.4 | 78.6 | +78.2 |
| MMLU-Pro (32B) | Accuracy | 64.9 | 73.6 | +8.7 |
| MATH500 (7B) | RAK | 3.5 | 47.7 | +44.2 |