Attack performance results on RealtimeQA (Mistral-7B) showing RobustRAG maintains accuracy while Standard RAG collapses under attack.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| RealtimeQA | Exact Match (Attack Size k'=5) | 10.0 | 60.0 | +50.0 |
| RealtimeQA | Exact Match (Clean/No Attack) | 62.0 | 61.0 | -1.0 |

Generalization results across different datasets using Mistral-7B.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Natural Questions (NQ) | Exact Match (Clean) | 44.6 | 47.7 | +3.1 |
| Natural Questions (NQ) | Exact Match (Attack Size k'=1) | 28.5 | 47.5 | +19.0 |
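
The Δ column is the "This Paper" score minus the "Baseline" score. As a minimal sketch, the snippet below recomputes Δ from the rows above and also derives how much each system loses when moving from the clean setting to the attacked one. The names `ROWS`, `deltas`, and `attack_drop` are hypothetical helpers introduced here for illustration, not part of the paper, and the clean-versus-attack drop is a derived quantity that the table itself does not report.

```python
# Rows copied from the table: (benchmark, setting, baseline, this_paper).
ROWS = [
    ("RealtimeQA", "attack k'=5", 10.0, 60.0),
    ("RealtimeQA", "clean", 62.0, 61.0),
    ("NQ", "clean", 44.6, 47.7),
    ("NQ", "attack k'=1", 28.5, 47.5),
]


def deltas(rows):
    """Return This Paper minus Baseline for every row, rounded to one decimal."""
    return [round(paper - base, 1) for _, _, base, paper in rows]


def attack_drop(rows, benchmark):
    """Return the clean-minus-attack score for each system on one benchmark."""
    clean = next(r for r in rows if r[0] == benchmark and r[1] == "clean")
    attack = next(r for r in rows if r[0] == benchmark and r[1].startswith("attack"))
    return {
        "baseline": round(clean[2] - attack[2], 1),
        "this_paper": round(clean[3] - attack[3], 1),
    }


if __name__ == "__main__":
    print(deltas(ROWS))
    for name in ("RealtimeQA", "NQ"):
        print(name, attack_drop(ROWS, name))
```

On these rows the baseline loses 52.0 points on RealtimeQA and 16.1 on NQ under attack, while the proposed method loses 1.0 and 0.2. The two benchmarks use different attack sizes (k'=5 and k'=1), so the drops are not directly comparable across datasets.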