| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| BeaverTails-V (Binary Setting) | Accuracy | Not reported | 78 | Not reported |
| BeaverTails-V (Multi-level Setting) | Accuracy | Not reported | 85 | Not reported |
| Safe RLHF-V Evaluation | Safety Improvement | 0 | 34.2 | +34.2 |
| Safe RLHF-V Evaluation | Helpfulness Improvement | 0 | 34.3 | +34.3 |