| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Safety degradation in VLLMs: fine-tuning increases vulnerability to text-based attacks relative to the base LLM (ASR rises in both AdvBench settings). | ||||
| AdvBench (Suffix) | ASR | 5.2 | 39.0 | +33.8 |
| AdvBench (Vanilla) | ASR | 23.2 | 29.6 | +6.4 |
| Efficacy of VLGuard Mixed Fine-Tuning: harmful responses drop sharply on both unsafe subsets, while the win rate on Safe-Safe inputs is roughly unchanged. | ||||
| VLGuard (Safe-Unsafe) | ASR | 53.6 | 1.1 | -52.5 |
| VLGuard (Unsafe) | ASR | 35.8 | 0.5 | -35.3 |
| VLGuard (Safe-Safe) | Win Rate vs GPT4V | 70.3 | 71.4 | +1.1 |
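
The Δ column appears to be the "This Paper" value minus the "Baseline" value, so a negative Δ on an ASR row means fewer successful attacks and a positive Δ on the win-rate row means a higher win rate. A minimal sketch that recomputes the column from the values in the table is shown below; the names `ROWS`, `delta`, and `find_mismatches` are hypothetical helpers introduced here for illustration, not part of the paper.

```python
# Rows copied from the table above:
# (benchmark, metric, baseline, this_paper, reported_delta)
ROWS = [
    ("AdvBench (Suffix)", "ASR", 5.2, 39.0, 33.8),
    ("AdvBench (Vanilla)", "ASR", 23.2, 29.6, 6.4),
    ("VLGuard (Safe-Unsafe)", "ASR", 53.6, 1.1, -52.5),
    ("VLGuard (Unsafe)", "ASR", 35.8, 0.5, -35.3),
    ("VLGuard (Safe-Safe)", "Win Rate vs GPT4V", 70.3, 71.4, 1.1),
]


def delta(baseline, this_paper):
    """Assumed definition of the delta column: this_paper - baseline, to one decimal."""
    return round(this_paper - baseline, 1)


def find_mismatches(rows):
    """Return (benchmark, recomputed_delta) for rows whose reported delta differs."""
    return [
        (name, delta(base, ours))
        for name, _metric, base, ours, reported in rows
        if delta(base, ours) != reported
    ]


if __name__ == "__main__":
    for name, metric, base, ours, _reported in ROWS:
        print(f"{name:<24} {metric:<18} {delta(base, ours):+.1f}")
    print("mismatches:", find_mismatches(ROWS))
```

Rounding to one decimal place matches the precision of the table and avoids spurious floating-point differences when comparing against the reported values.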