| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| SAT-2 | Accuracy | 44.6 | 57.5 | +12.9 |
| BLINK | Accuracy | 56.5 | 58.5 | +2.0 |
| V*Bench | Accuracy | Not reported in the paper | 86.4 | Not reported in the paper |

ViGoRL consistently outperforms baselines on spatial reasoning tasks, showing the value of grounded RL.