| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Results on QuALITY showing ReadAgent outperforms both retrieval baselines and the Full Text baseline (despite Full Text fitting in context). | ||||
| QuALITY | Accuracy | 85.83% | 87.17% | +1.34 pp |
| QuALITY | Accuracy | 71.32% | 84.13% | +12.81 pp |
| Results on NarrativeQA (Gutenberg) demonstrating performance on very long contexts (books). | ||||
| NarrativeQA (Gutenberg) | ROUGE-L | 0.197 | 0.226 | +0.029 |
| NarrativeQA (Gutenberg) | LLM Rating-1 (Strict) | 50.62% | 59.98% | +9.36 pp |
| Results on QMSum (meeting transcripts), where Sequential lookup shows a significant advantage. | ||||
| QMSum | ROUGE-L | 16.58 | 21.15 | +4.57 |
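
The Δ column appears to be the absolute difference between the "This Paper" and "Baseline" columns (percentage points for the accuracy and rating rows), not a relative improvement. A minimal sketch for rechecking that arithmetic follows; the values are copied from the table above, and the names `rows` and `absolute_delta` are illustrative, not from the paper.

```python
# Values copied from the table: (benchmark, metric, baseline, this_paper).
rows = [
    ("QuALITY", "Accuracy (%)", 85.83, 87.17),
    ("QuALITY", "Accuracy (%)", 71.32, 84.13),
    ("NarrativeQA (Gutenberg)", "ROUGE-L", 0.197, 0.226),
    ("NarrativeQA (Gutenberg)", "LLM Rating-1 (Strict, %)", 50.62, 59.98),
    ("QMSum", "ROUGE-L", 16.58, 21.15),
]


def absolute_delta(baseline: float, result: float, ndigits: int = 3) -> float:
    """Return result - baseline, rounded to suppress floating-point noise."""
    return round(result - baseline, ndigits)


for benchmark, metric, baseline, result in rows:
    delta = absolute_delta(baseline, result)
    print(f"{benchmark:<26} {metric:<26} {delta:+}")
```

Note that the two ROUGE-L rows are reported on different scales in the table (0 to 1 for NarrativeQA, 0 to 100 for QMSum), so their deltas are not directly comparable without rescaling.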