| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| LMTX outperforms non-LLM baselines in Precision@1 (P@1) on four large-scale datasets, with gains ranging from +0.20 to +23.94 points. | ||||
| LF-Wikipedia-500K | P@1 | 31.25 | 41.05 | +9.80 |
| AmazonCat-13K | P@1 | 63.95 | 87.89 | +23.94 |
| EURLex-4k | P@1 | 42.92 | 47.28 | +4.36 |
| LF-WikiSeeAlso-320K | P@1 | 26.36 | 26.56 | +0.20 |
| Compared with LLM-based inference (ICXML, shown in the Baseline column), LMTX achieves higher P@1 at much lower inference cost. | ||||
| EURLex-4k | P@1 | 19.14 | 47.28 | +28.14 |
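
For readers unfamiliar with the metric, the following is a minimal sketch of how Precision@k and the Δ column are conventionally computed. It assumes the standard extreme multi-label definition (the fraction of the top-k ranked labels that are relevant, averaged over test instances and reported as a percentage). The function name `precision_at_k` and the toy data are illustrative only and are not taken from the paper's code.

```python
from typing import Sequence, Set


def precision_at_k(
    ranked: Sequence[Sequence[int]],
    relevant: Sequence[Set[int]],
    k: int = 1,
) -> float:
    """Mean share of the top-k predicted labels that are relevant, in percent."""
    if not ranked or len(ranked) != len(relevant):
        raise ValueError("ranked and relevant must be non-empty and equal in length")
    total = 0.0
    for preds, gold in zip(ranked, relevant):
        # Count how many of the k highest-ranked labels are in the gold set.
        hits = sum(1 for label in preds[:k] if label in gold)
        total += hits / k
    return 100.0 * total / len(ranked)


# Toy example (hypothetical data): 4 documents, labels ranked best-first.
ranked = [[3, 1, 2], [0, 4, 1], [2, 0, 3], [1, 3, 4]]
relevant = [{3, 2}, {4}, {2}, {0}]

p1 = precision_at_k(ranked, relevant, k=1)  # 2 of 4 top-1 labels are correct
p3 = precision_at_k(ranked, relevant, k=3)

# The Δ column is the absolute difference in percentage points,
# e.g. for the LF-Wikipedia-500K row of the table above.
delta = round(41.05 - 31.25, 2)
```

Under this definition, Δ is an absolute gain in percentage points, not a relative improvement.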