| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Reinforcement learning post-training yields significant improvements across diverse visual perception tasks compared to the supervised fine-tuned baseline. | | | | |
| RefCOCO+ | Acc@0.5 | Not reported in the paper | Not reported in the paper | +4.2% |
| PixMo-Count | Performance Score | Not reported in the paper | Not reported in the paper | +17.9% |
| PageOCR | F1-score | 64.7 | 98.1 | +33.4 points |
| COCO2017 val | mAP | Not reported in the paper | 31.9 | Not reported in the paper |
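
The Δ column mixes entries written with a percent sign and one written without. PageOCR is the only row that reports both scores, so it is the only Δ that can be checked from the table. The sketch below recomputes it as an absolute difference and also shows the relative gain for comparison. The function names are illustrative and do not come from the paper, and nothing here establishes how the percent-signed entries in the other rows were computed.

```python
def absolute_gain(baseline: float, new: float) -> float:
    """Difference in score points (new minus baseline), rounded to one decimal."""
    return round(new - baseline, 1)


def relative_gain_pct(baseline: float, new: float) -> float:
    """Gain as a percentage of the baseline score, rounded to one decimal."""
    return round((new - baseline) / baseline * 100, 1)


# PageOCR F1-scores taken from the table above.
page_ocr_baseline = 64.7
page_ocr_new = 98.1

print(absolute_gain(page_ocr_baseline, page_ocr_new))      # matches the table's +33.4
print(relative_gain_pct(page_ocr_baseline, page_ocr_new))  # relative gain, for comparison
```

The absolute difference reproduces the +33.4 in the table, so that entry is a gain in F1 points rather than a relative percentage.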