| Benchmark | Metric | Baseline | This Paper | Δ (This Paper - Baseline) |
|---|---|---|---|---|
| Linear MDP experiments show VIPeR matches the performance of explicit LCB methods (PEVI) asymptotically while outperforming greedy baselines. | ||||
| Synthetic Linear MDP (H=20) | Sub-optimality | 100.0 | 0.1 | -99.9 |
| Neural Contextual Bandit experiments show VIPeR achieving lower sub-optimality and lower runtime than NeuralLCB. | ||||
| Neural Contextual Bandits (MNIST) | Sub-optimality | 0.15 | 0.02 | -0.13 |
| Neural Contextual Bandits | Runtime (seconds) | 40.0 | 0.01 | -39.99 |
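
The Δ column is consistent with "This Paper minus Baseline", so negative values mean an improvement for both sub-optimality and runtime. The sketch below recomputes Δ from the table values and adds the relative change for context. The row tuples are transcribed from the table; the names `ROWS`, `delta`, `relative_change`, and `check_rows` are illustrative and do not come from the paper.

```python
# Rows transcribed from the table:
# (benchmark, metric, baseline, this_paper, reported_delta)
ROWS = [
    ("Synthetic Linear MDP (H=20)", "Sub-optimality", 100.0, 0.1, -99.9),
    ("Neural Contextual Bandits (MNIST)", "Sub-optimality", 0.15, 0.02, -0.13),
    ("Neural Contextual Bandits", "Runtime (seconds)", 40.0, 0.01, -39.99),
]


def delta(baseline: float, this_paper: float) -> float:
    """Absolute change, assuming delta = this_paper - baseline."""
    return this_paper - baseline


def relative_change(baseline: float, this_paper: float) -> float:
    """Change as a fraction of the baseline (negative means a reduction)."""
    return (this_paper - baseline) / baseline


def check_rows(rows, tol: float = 1e-9):
    """Recompute delta for each row and flag whether it matches the reported value."""
    results = []
    for benchmark, metric, baseline, this_paper, reported in rows:
        d = delta(baseline, this_paper)
        rel = relative_change(baseline, this_paper)
        results.append((benchmark, metric, d, rel, abs(d - reported) <= tol))
    return results


if __name__ == "__main__":
    for benchmark, metric, d, rel, ok in check_rows(ROWS):
        status = "matches" if ok else "differs from"
        print(f"{benchmark} | {metric}: delta={d:.2f} ({rel:.2%}), {status} reported value")
```

Because both metrics are lower-is-better, the relative change is the more comparable figure across rows: the absolute Δ values are on different scales (sub-optimality versus seconds).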