| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| O2-Searcher performance on the newly constructed O2-QA benchmark against SOTA agents. | | | | |
| O2-QA | Performance score (qualitative claim only) | Not reported in the paper | Not reported in the paper | Not reported in the paper |
| Performance on standard closed-ended benchmarks showing competitiveness with larger models. | | | | |
| Closed-ended QA Benchmarks | SOTA status | Not reported in the paper | Not reported in the paper | Not reported in the paper |