| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Token distribution analysis showing the minimal impact of alignment tuning on decoding choices. | ||||
| Token Shift Analysis | Unshifted Ratio (%) | 0 | 77.7 | +77.7 |
| Token Shift Analysis | Top-3 Overlap (%) | 0 | 92.2 | +92.2 |
| Comparative performance of URIAL against SFT and RLHF baselines on the just-eval-instruct dataset. | ||||
| just-eval-instruct | Average Score (1-5) | 4.44 | 4.63 | +0.19 |
| just-eval-instruct | Average Score (1-5) | 4.67 | 4.74 | +0.07 |
| just-eval-instruct | Average Score (1-5) | 3.18 | 4.33 | +1.15 |
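
The table does not define the two token-shift metrics, so the sketch below assumes one plausible reading. At each decoding position, the token chosen by the aligned model is ranked under the base model's next-token distribution. A position counts as "unshifted" when that rank is 1 and as a "top-3 overlap" when the rank is 3 or better. The function names, the dictionary-based input format, and the toy distributions are all hypothetical and are not taken from the paper's code.

```python
from typing import Dict, List, Sequence


def rank_in_base(base_probs: Dict[str, float], token: str) -> int:
    """1-based rank of `token` under the base model's next-token distribution."""
    if token not in base_probs:
        # Assumption: a token missing from the listed distribution ranks below all listed tokens.
        return len(base_probs) + 1
    p = base_probs[token]
    # Rank is 1 plus the number of tokens with strictly higher probability.
    return 1 + sum(1 for q in base_probs.values() if q > p)


def token_shift_stats(
    base_dists: Sequence[Dict[str, float]],
    aligned_tokens: Sequence[str],
) -> Dict[str, float]:
    """Percentage of positions where the aligned token is rank 1 / within the top 3 of the base model."""
    if len(base_dists) != len(aligned_tokens):
        raise ValueError("need one base distribution per aligned token")
    if not aligned_tokens:
        raise ValueError("need at least one position")
    ranks: List[int] = [rank_in_base(d, t) for d, t in zip(base_dists, aligned_tokens)]
    n = len(ranks)
    return {
        "unshifted_pct": 100 * sum(r == 1 for r in ranks) / n,
        "top3_pct": 100 * sum(r <= 3 for r in ranks) / n,
    }


# Toy data (made up for illustration): four positions, ranks 1, 2, 4, 1.
base_dists = [
    {"The": 0.6, "A": 0.3, "I": 0.1},
    {"cat": 0.5, "dog": 0.3, "sky": 0.2},
    {"is": 0.4, "was": 0.3, "sat": 0.2, "ran": 0.1},
    {".": 0.9, "!": 0.1},
]
aligned_tokens = ["The", "dog", "ran", "."]

stats = token_shift_stats(base_dists, aligned_tokens)
print(stats)
```

Under this reading, the top-3 figure is always at least as large as the unshifted figure, since every rank-1 position is also within the top 3. That matches the ordering of the two reported values (77.7 and 92.2).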