| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Predictive task results comparing Knowledge Distillation (KD) against Supervised Fine-Tuning (SFT), measured relative to the Foundation Model (FM) baseline. KD preserves FM performance significantly better than SFT. | ||||
| Internal Ranking Tasks | AUC Delta (%) | 0.00 | -0.06 | -0.06 |
| Internal Ranking Tasks | AUC Delta (%) | 0.00 | -0.62 | -0.62 |
| Internal Ranking Tasks | AUC Delta (%) | 0.00 | -0.15 | -0.15 |
| Internal Ranking Tasks | AUC Delta (%) | 0.00 | -1.21 | -1.21 |
| Reasoning task improvements using the proposed distillation recipes on open-source models. | ||||
| AIME 2024 | Performance Improvement | 0 | 20 | +20 |
| Serving Latency | Prefill Speedup | 0 | 28 | +28 |
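
The "AUC Delta (%)" rows report each model against a baseline fixed at 0.00, so the Δ column is simply the model's value minus the baseline. The table does not say whether the delta is relative or absolute. The sketch below assumes a relative change in percent. The function name `auc_delta_pct` and the AUC values are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def auc_delta_pct(model_auc: float, baseline_auc: float) -> float:
    """Relative AUC change of a model versus a baseline, in percent.

    Assumption: the delta is relative to the baseline AUC. The source
    table does not confirm whether it is relative or absolute.
    """
    if baseline_auc <= 0:
        raise ValueError("baseline_auc must be positive")
    return 100.0 * (model_auc - baseline_auc) / baseline_auc


# Hypothetical AUC values for illustration only (not from the table).
fm_auc = 0.80        # foundation model baseline
student_auc = 0.796  # a distilled or fine-tuned model

delta = auc_delta_pct(student_auc, fm_auc)
print(f"AUC delta: {delta:.2f}%")  # roughly -0.50%
```

Under this definition the baseline always scores 0.00 against itself, which matches the Baseline column above.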