| Benchmark | Metric | Baseline | This Paper | Δ (This Paper − Baseline) |
|---|---|---|---|---|
| Performance on unseen datasets with LLaVA-1.5-13B shows the benefit of fine-tuning connectors. | | | | |
| ScienceQA (img) | Accuracy | 69.05 | 72.48 | +3.43 |
| VizWiz | Accuracy | 50.81 | 52.82 | +2.01 |
| Hallucination analysis on Flickr30k: the Adapter has the lowest hallucination rate (lower is better). | | | | |
| Flickr30k | Hallucination Rate | 54.2 | 11.1 | -43.1 |
| Flickr30k | Hallucination Rate | 34.7 | 11.1 | -23.6 |
| Module location ablation study on LLaVA-1.5-7B (ScienceQA). | | | | |
| ScienceQA | Accuracy | 65.57 | 69.17 | +3.60 |
| ScienceQA | Accuracy | 69.25 | 70.15 | +0.90 |
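
The Δ column can be re-derived from the other two numeric columns. The sketch below assumes Δ is simply "This Paper" minus "Baseline", which matches every row of the table. The relative-change helper is an illustrative addition and not a figure reported in the source, and the names `ROWS`, `delta`, and `relative_change` are hypothetical.

```python
# Rows copied from the table: (benchmark, metric, baseline, this_paper).
ROWS = [
    ("ScienceQA (img)", "Accuracy", 69.05, 72.48),
    ("VizWiz", "Accuracy", 50.81, 52.82),
    ("Flickr30k", "Hallucination Rate", 54.2, 11.1),
    ("Flickr30k", "Hallucination Rate", 34.7, 11.1),
    ("ScienceQA", "Accuracy", 65.57, 69.17),
    ("ScienceQA", "Accuracy", 69.25, 70.15),
]


def delta(baseline, ours, ndigits=2):
    """Absolute difference, assumed to be This Paper minus Baseline."""
    return round(ours - baseline, ndigits)


def relative_change(baseline, ours, ndigits=1):
    """Change relative to the baseline, in percent (not reported in the table)."""
    return round(100.0 * (ours - baseline) / baseline, ndigits)


if __name__ == "__main__":
    for name, metric, base, ours in ROWS:
        d = delta(base, ours)
        r = relative_change(base, ours)
        print(f"{name:16s} {metric:19s} {d:+.2f} ({r:+.1f}%)")
```

A negative Δ is an improvement for the hallucination-rate rows, since lower is better there, while a positive Δ is an improvement for the accuracy rows.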