| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Trifle outperforms or matches baselines on standard Gym-MuJoCo benchmarks. | ||||
| Gym-MuJoCo (Medium-Expert) | Normalized Score | 88.9 | 91.8 | +2.9 |
| Gym-MuJoCo (Medium) | Normalized Score | 78.0 | 80.4 | +2.4 |
| In stochastic environments, Trifle shows large gains over the baseline, attributed to its exact marginalization. | ||||
| Stochastic Hopper-Medium-v2 | Normalized Score | 31.2 | 102.3 | +71.1 |
| Stochastic HalfCheetah-Medium-v2 | Normalized Score | 4.2 | 42.0 | +37.8 |
| Trifle effectively handles safe RL constraints at inference time. | ||||
| Hopper-Medium-Replay (Constrained) | Normalized Score | 26.3 | 83.2 | +56.9 |
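
The Δ column is the reported score minus the baseline score. As a minimal sketch for checking that arithmetic, the snippet below recomputes each difference from the values in the table. The names `results` and `delta` are illustrative and do not come from the paper.

```python
# Rows copied from the table above: (benchmark, baseline, this paper).
results = [
    ("Gym-MuJoCo (Medium-Expert)", 88.9, 91.8),
    ("Gym-MuJoCo (Medium)", 78.0, 80.4),
    ("Stochastic Hopper-Medium-v2", 31.2, 102.3),
    ("Stochastic HalfCheetah-Medium-v2", 4.2, 42.0),
    ("Hopper-Medium-Replay (Constrained)", 26.3, 83.2),
]


def delta(baseline: float, score: float) -> float:
    """Absolute improvement in normalized score, rounded to one decimal."""
    return round(score - baseline, 1)


for name, baseline, score in results:
    # The explicit sign matches the formatting of the Δ column.
    print(f"{name}: {delta(baseline, score):+.1f}")
```

Rounding to one decimal matches the precision of the table and hides floating-point noise in the subtraction.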