| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Multi-hop RAG | Accuracy | Not reported (qualitative comparison only) | Not reported | Not reported |
| Multi-hop RAG | Token usage | 106 million tokens (for 100 questions) | Not reported (implied to be significantly lower) | Not reported |