| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Physics Supernova significantly outperforms the base LLM and reaches the gold-medal standard on IPhO 2025. | ||||
| IPhO 2025 | Total Theory Score | 15.6 | 23.5 | +7.9 |
| IPhO 2025 | Rank (lower is better) | 30 | 14 | -16 |
| Ablation studies demonstrate the critical contribution of both the Image Analyzer and Answer Reviewer tools. | ||||
| IPhO 2025 | Total Theory Score | 21.6 | 23.5 | +1.9 |
| IPhO 2025 | Total Theory Score | 18.8 | 23.5 | +4.7 |
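
The Δ column appears to be the "This Paper" value minus the "Baseline" value, so a negative Δ is an improvement for rank and a positive Δ is an improvement for score. The following is a minimal sketch that recomputes each Δ from the figures in the table. The row labels and the `delta` helper are illustrative assumptions; the table itself does not say which tool is removed in each ablation row, so those rows are labelled only by position.

```python
# Values copied from the table: (label, baseline, this_paper, reported_delta).
# Labels are illustrative; the ablation rows are not named in the table.
rows = [
    ("Total Theory Score vs. base LLM", 15.6, 23.5, 7.9),
    ("Rank vs. base LLM", 30, 14, -16),
    ("Total Theory Score, ablation row 1", 21.6, 23.5, 1.9),
    ("Total Theory Score, ablation row 2", 18.8, 23.5, 4.7),
]


def delta(baseline, this_paper):
    """Assumed convention: delta = this_paper - baseline, rounded to one decimal."""
    return round(this_paper - baseline, 1)


# Check every reported delta against the recomputed value.
for label, baseline, this_paper, reported in rows:
    assert delta(baseline, this_paper) == reported, label
```

Rounding to one decimal place matches the precision reported in the table and avoids spurious floating-point mismatches.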