| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Ablation studies: impact of the proposed memory optimizations on retrieval and QA performance | | | | |
| LongMemEval | Recall@k | Not reported in the paper | Not reported in the paper | +9.4% |
| LongMemEval | QA Accuracy | Not reported in the paper | Not reported in the paper | +5.4% |
| LongMemEval (Temporal Reasoning) | Recall@k | Not reported in the paper | Not reported in the paper | +11.3% |
| Reading strategy experiments: how the model processes retrieved context has a substantial effect on QA accuracy | | | | |
| LongMemEval | QA Accuracy | Not reported in the paper | Not reported in the paper | ≈ +10.0 |
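
The table reports Recall@k without defining it. As a point of reference, the sketch below uses the common definition: the fraction of relevant items that appear among the top k retrieved results. This is an assumption, since the paper's exact formulation is not given here. The function names `recall_at_k` and `mean_recall_at_k` and the session IDs in the usage example are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of relevant items found among the top-k retrieved items.

    Assumes the common Recall@k definition; the paper may use a variant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        # No relevant items for this query: define recall as 0.0 by convention.
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(
    queries: Iterable[Tuple[Sequence[str], Set[str]]], k: int
) -> float:
    """Average Recall@k over (ranked_ids, relevant_ids) pairs."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in queries]
    if not scores:
        raise ValueError("no queries given")
    return sum(scores) / len(scores)


# Hypothetical example: one question whose evidence lives in sessions s1 and s2.
ranked = ["s3", "s1", "s7", "s2"]
relevant = {"s1", "s2"}
print(recall_at_k(ranked, relevant, k=2))  # only s1 is in the top 2
print(recall_at_k(ranked, relevant, k=4))  # both s1 and s2 are in the top 4
```

With this definition, the hypothetical query above scores 0.5 at k=2 and 1.0 at k=4, which shows how the metric depends on the chosen cutoff.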