| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Average across 6 benchmarks | Accuracy | Not reported in the paper | Not reported in the paper | +2.0 |
| Average across 6 benchmarks | Accuracy | Not reported in the paper | Not reported in the paper | +1.2 |
| Average across 6 benchmarks | Accuracy | Not reported in the paper | Not reported in the paper | +1.3 |
| Prompt Utilization Analysis | Effective Prompt Usage | Not reported in the paper | Not reported in the paper | +10% |