| Benchmark | Metric | Baseline | This Paper | Δ (This Paper − Baseline) |
|---|---|---|---|---|
| Magnet-14B-mDPO achieves state-of-the-art results among open models on BFCL-v3, significantly improving over its base model and surpassing its teacher. | | | | |
| BFCL-v3 (Overall) | Success Rate | 66.09 | 68.01 | +1.92 |
| BFCL-v3 (Multi-turn) | Success Rate | 13.64 | 46.14 | +32.50 |
| ToolQuery | Success Rate | 71.70 | 73.30 | +1.60 |
| Ablation studies confirm the value of mDPO and the graph-based data synthesis components. | | | | |
| BFCL-v3 (Overall) | Success Rate | 66.30 | 68.01 | +1.71 |
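
The Δ column appears to be the plain difference between the "This Paper" and "Baseline" columns, in the same units as the success rates. The sketch below re-derives it from the four data rows as a sanity check, and also computes a relative gain, which is not reported in the table. The function names and the "ablation" label on the last row are illustrative assumptions, not taken from the paper.

```python
# Rows copied from the table: (benchmark, baseline, this_paper, reported_delta).
# The ", ablation" suffix on the last row is an assumed label that distinguishes
# it from the first row; the table itself does not name the ablation baseline.
ROWS = [
    ("BFCL-v3 (Overall)", 66.09, 68.01, 1.92),
    ("BFCL-v3 (Multi-turn)", 13.64, 46.14, 32.50),
    ("ToolQuery", 71.70, 73.30, 1.60),
    ("BFCL-v3 (Overall), ablation", 66.30, 68.01, 1.71),
]


def absolute_delta(baseline, ours):
    """Difference in success-rate points, rounded to two decimals."""
    return round(ours - baseline, 2)


def relative_gain(baseline, ours):
    """Improvement as a percentage of the baseline score (not in the table)."""
    return round(100.0 * (ours - baseline) / baseline, 1)


def check_rows(rows, tol=0.005):
    """Return one boolean per row: does the reported delta match ours - baseline?"""
    return [abs((ours - base) - reported) <= tol for _, base, ours, reported in rows]


if __name__ == "__main__":
    for name, base, ours, reported in ROWS:
        print(name, absolute_delta(base, ours), reported, relative_gain(base, ours))
    print("all deltas consistent:", all(check_rows(ROWS)))
```

Reporting both numbers can be useful because a point difference and a relative gain read very differently when the baseline is low, as in the multi-turn row.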