RLVR experiments on Llama3-U show that using generic benign data (Alpaca) causes severe overrefusal, whereas the proposed method mitigates it while preserving safety.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| HEx-PHI | ASR | 84.55 | 9.70 | -74.85 |
| JBench-B | Refusal Rate (RR) | 10 | 67 | +57 |
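
The Δ column appears to follow the convention Δ = This Paper - Baseline, which is an assumption inferred from the two rows rather than something the source states. A minimal sketch that recomputes the column under that assumption (the helper name `delta` is hypothetical):

```python
def delta(baseline: float, this_paper: float, ndigits: int = 2) -> float:
    """Signed change, ASSUMING the convention delta = this_paper - baseline."""
    return round(this_paper - baseline, ndigits)


# Rows copied from the table: (benchmark, metric, baseline, this paper).
rows = [
    ("HEx-PHI", "ASR", 84.55, 9.70),
    ("JBench-B", "Refusal Rate (RR)", 10, 67),
]

for benchmark, metric, baseline, this_paper in rows:
    # "+g" prints an explicit sign and drops trailing zeros.
    print(f"{benchmark} | {metric} | {delta(baseline, this_paper):+g}")
```

Whether a negative or positive Δ is an improvement depends on the metric, so the sign alone does not say which direction is better.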