Performance on conversational benchmarks, where the text reports improvements over baselines. Absolute baseline values for standard DPO are not available, so only the Curri-DPO values are listed.

| Benchmark | Metric | Baseline | This Paper (Curri-DPO) | Δ |
|---|---|---|---|---|
| MT-Bench | Score (1-10) | Not reported in the paper | 7.43 | Not reported in the paper |
| Vicuna Bench | Win Rate | Not reported in the paper | 90.7% | Not reported in the paper |
| WizardLM | Win Rate | Not reported in the paper | 87.1% | Not reported in the paper |