| Benchmark | Metric | Baseline | This Paper | Δ (This Paper − Baseline) |
|---|---|---|---|---|
| Linear Bandits (Unknown Representation) | Cumulative Regret at step 100 | ≈100 (linear growth, read from plot trend) | ≈20 (sublinear growth, read from plot trend) | ≈ −80 |
| Dark Room MDP | Success Rate | 0.15 | 0.95 | +0.80 |
| Dark Room MDP | Steps to Goal | Slower convergence | Faster convergence | Qualitative improvement (faster convergence) |
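
The Δ column can be reproduced from the two value columns as This Paper minus Baseline. The snippet below is a minimal sketch of that arithmetic using the numbers in the table; the helper name `delta` and the rounding to two decimals are illustrative assumptions, and the regret values are approximate plot readings rather than exact measurements.

```python
def delta(baseline, this_paper, ndigits=2):
    """Return this_paper - baseline, rounded to suppress floating-point noise.

    `delta` and `ndigits` are hypothetical names used only for this illustration.
    """
    return round(this_paper - baseline, ndigits)


# Values copied from the table above.
# Cumulative regret at step 100 (approximate, read from the plot trend).
regret_delta = delta(100, 20)

# Dark Room MDP success rate.
success_delta = delta(0.15, 0.95)

print("Cumulative regret delta:", regret_delta)
print("Success rate delta:", success_delta)
```

A negative Δ is the desired direction for cumulative regret (lower is better), while a positive Δ is the desired direction for success rate (higher is better).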