| Benchmark | Metric | Baseline | This Paper | Δ (This Paper − Baseline) |
|---|---|---|---|---|
| Single-step retrieval comparisons showing HippoRAG outperforming baselines on complex multi-hop datasets. | ||||
| 2WikiMultiHopQA | Recall@2 | 45.0 | 64.8 | 19.8 |
| MuSiQue | Recall@2 | 34.7 | 39.4 | 4.7 |
| MuSiQue | Recall@2 | 26.0 | 39.4 | 13.4 |
| Combination with iterative methods demonstrates complementary gains. | ||||
| 2WikiMultiHopQA | Recall@5 | 57.7 | 78.1 | 20.4 |
| MuSiQue | Recall@5 | 59.2 | 63.4 | 4.2 |
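
The Δ column reads as the absolute difference between the "This Paper" score and the baseline score. A minimal sketch that recomputes it from the rows above is given here; the names `rows` and `absolute_gain` are hypothetical and introduced only for illustration, and the rounding to one decimal place is an assumption based on the precision shown in the table.

```python
# Rows copied from the table above:
# (benchmark, metric, baseline, this_paper, reported_delta)
rows = [
    ("2WikiMultiHopQA", "Recall@2", 45.0, 64.8, 19.8),
    ("MuSiQue", "Recall@2", 34.7, 39.4, 4.7),
    ("MuSiQue", "Recall@2", 26.0, 39.4, 13.4),
    ("2WikiMultiHopQA", "Recall@5", 57.7, 78.1, 20.4),
    ("MuSiQue", "Recall@5", 59.2, 63.4, 4.2),
]


def absolute_gain(baseline: float, score: float) -> float:
    """Absolute improvement over the baseline, rounded to one decimal
    place (assumed to match the table's precision)."""
    return round(score - baseline, 1)


for benchmark, metric, baseline, score, reported in rows:
    computed = absolute_gain(baseline, score)
    status = "ok" if computed == reported else "mismatch"
    print(f"{benchmark:<16} {metric:<9} computed={computed:>5} reported={reported:>5} {status}")
```

For the rows as listed, each recomputed value agrees with the reported Δ. The two MuSiQue Recall@2 rows share the same "This Paper" score and differ only in the baseline they are compared against.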