VPD achieves state-of-the-art results on multiple VQA and reasoning benchmarks compared to the base PaLI-X model and other leading VLMs.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| OK-VQA | Accuracy | 66.1 | 68.8 | +2.7 |
| A-OKVQA (val) | Accuracy | 64.5 | 65.6 | +1.1 |
| TallyQA (Simple) | Accuracy | 83.6 | 88.6 | +5.0 |
| TallyQA (Complex) | Accuracy | 68.6 | 73.9 | +5.3 |
| Hateful Memes | ROC AUC | 84.9 | 87.1 | +2.2 |
| MMBench | Accuracy | 79.0 | 81.3 | +2.3 |