| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Main results on API-Bank showing TGR improves upon base embeddings (MiniLM and ToolBench-IR). | ||||
| API-Bank | Recall@5 | 0.697 | 0.762 | +0.065 |
| API-Bank | NDCG@5 | 0.722 | 0.784 | +0.062 |
| API-Bank | Pass Rate@5 | 0.463 | 0.540 | +0.077 |
| Main results on ToolBench (I1 level) showing consistent but smaller gains compared to API-Bank. | ||||
| ToolBench | Recall@5 | 0.548 | 0.563 | +0.015 |
| ToolBench | Pass Rate@5 | 0.260 | 0.337 | +0.077 |
| Ablation study comparing discriminator-based graph (+TGR-d) vs manually annotated graph (+TGR-m) on API-Bank. | ||||
| API-Bank | Recall@5 | 0.650 | 0.672 | +0.022 |
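
For readers unfamiliar with the retrieval metrics in the table, the following is a minimal sketch of how Recall@5 and NDCG@5 are conventionally computed for a single query. It assumes binary relevance (a retrieved tool is either in the ground-truth set or not); the paper's exact definitions are not stated here and may differ. The function names `recall_at_k` and `ndcg_at_k` and the tool identifiers are hypothetical, chosen only for illustration.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of ground-truth tools that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = sum(1 for tool in ranked[:k] if tool in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, tool in enumerate(ranked[:k])
        if tool in relevant
    )
    # The ideal ranking places all relevant tools (at most k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: two of three ground-truth tools are retrieved.
ranked_tools = ["a", "b", "c", "d", "e"]
ground_truth = {"a", "c", "z"}
recall = recall_at_k(ranked_tools, ground_truth)
ndcg = ndcg_at_k(ranked_tools, ground_truth)
```

Benchmark-level scores such as those in the table would typically be the mean of these per-query values over all test queries, and the Δ column is the difference between the two reported means.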