| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Trace-Free+ consistently improves tool-use performance on unseen tools compared with using the original documentation. | ||||
| StableToolBench | Solvable SR | 73.2 | 80.3 | +7.1 |
| RestBench (TMDB) | Pass Rate | 66.7 | 77.1 | +10.4 |
| Ablation shows curriculum learning is essential; training only on trace-free data is suboptimal. | ||||
| StableToolBench | Solvable SR | 78.2 | 80.3 | +2.1 |
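
The Δ column appears to be the "This Paper" score minus the "Baseline" score. A minimal sketch that rechecks this arithmetic, assuming that reading of the column. The values are copied from the rows above, and the names `rows` and `delta` are illustrative, not from the paper:

```python
# Rows copied from the table above:
# (benchmark, metric, baseline, this_paper, reported_delta).
rows = [
    ("StableToolBench", "Solvable SR", 73.2, 80.3, 7.1),
    ("RestBench (TMDB)", "Pass Rate", 66.7, 77.1, 10.4),
    ("StableToolBench", "Solvable SR", 78.2, 80.3, 2.1),  # ablation row
]


def delta(baseline: float, this_paper: float) -> float:
    """Difference between the two scores, rounded to one decimal as in the table."""
    return round(this_paper - baseline, 1)


# Recompute the delta for every row (assumption: delta = this_paper - baseline).
computed = [delta(b, p) for _, _, b, p, _ in rows]

for (name, metric, b, p, _), d in zip(rows, computed):
    print(f"{name} ({metric}): {b} -> {p}, delta = {d:+.1f}")
```

Under this assumption, the recomputed differences match the reported Δ values in all three rows.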