D-PPO generally outperforms PPO across most tested Atari environments, roughly doubling the average return in Enduro and improving it by about 70% in Breakout. Boxing is the one listed game where the baseline scores higher.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Enduro | Average Return | 194.552 | 391.222 | +196.67 |
| Breakout | Average Return | 79.615 | 135.457 | +55.842 |
| DemonAttack | Average Return | 4426.82 | 6184.817 | +1757.997 |
| Kangaroo | Average Return | 1824.8 | 2792.167 | +967.367 |
| Boxing | Average Return | 90.923 | 83.171 | -7.752 |
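
The Δ column is the difference between the two average returns. Because raw returns are on very different scales across games, a relative change can make the rows easier to compare. The snippet below is a minimal sketch of that arithmetic using the values from the table. The names `results` and `compare` are hypothetical, the Baseline column is assumed to be PPO, and the relative change is a derived quantity that the source does not report.

```python
# Average returns copied from the table: (baseline, this paper).
# Assumption: "Baseline" is PPO and "This Paper" is D-PPO.
results = {
    "Enduro": (194.552, 391.222),
    "Breakout": (79.615, 135.457),
    "DemonAttack": (4426.82, 6184.817),
    "Kangaroo": (1824.8, 2792.167),
    "Boxing": (90.923, 83.171),
}


def compare(baseline, new):
    """Return (absolute delta, relative change in percent) for one game."""
    delta = new - baseline
    relative = 100.0 * delta / baseline
    return round(delta, 3), round(relative, 1)


# Hypothetical summary structure: game -> (delta, relative change in %).
summary = {game: compare(b, n) for game, (b, n) in results.items()}

for game, (delta, relative) in summary.items():
    print(f"{game:12s} delta={delta:+10.3f}  relative={relative:+6.1f}%")
```

Ranking by relative change rather than by Δ puts Enduro and Breakout ahead of DemonAttack, even though DemonAttack has the largest absolute gain.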