| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Main comparison on the Tool-DE benchmark: Tool-Embed-4B outperforms general-purpose embedding models and non-expanded baselines. | | | | |
| Tool-DE | NDCG@10 | 46.21 | 52.23 | +6.02 |
| Tool-DE | Recall@10 | 57.52 | 63.13 | +5.61 |
| Reranking: applying Tool-Rank on top of the retrieval results yields further improvements. | | | | |
| Tool-DE | NDCG@10 | 52.23 | 56.44 | +4.21 |
| Ablation: impact of training on expanded documents versus original documents. | | | | |
| Tool-DE | NDCG@10 | 46.80 | 52.23 | +5.43 |
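
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how NDCG@10 and Recall@10 are commonly computed for a single query. It assumes linear gain and a log2 position discount; the paper's exact evaluation setup (gain function, tie handling, averaging) is not stated here, and the function names `dcg_at_k`, `ndcg_at_k`, and `recall_at_k` are illustrative, not taken from the paper.

```python
import math
from typing import Mapping, Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions (linear gain, log2 discount)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids: Sequence[str], relevance: Mapping[str, float], k: int = 10) -> float:
    """NDCG@k for one query: DCG of the ranking divided by DCG of the ideal ranking."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def recall_at_k(ranked_ids: Sequence[str], relevance: Mapping[str, float], k: int = 10) -> float:
    """Recall@k for one query: fraction of relevant documents found in the top k."""
    relevant = {doc_id for doc_id, rel in relevance.items() if rel > 0}
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


# Hypothetical toy query: two relevant tools, retrieved at ranks 1 and 3.
ranking = ["tool_a", "tool_b", "tool_c"]
labels = {"tool_a": 1.0, "tool_c": 1.0}
print(ndcg_at_k(ranking, labels), recall_at_k(ranking, labels))
```

Scores like those in the table are typically the per-query values averaged over all queries and multiplied by 100; the Δ column is then the absolute difference in points between the two systems.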