CoLaR significantly outperforms latent-based baselines in accuracy while maintaining comparable or better efficiency.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Math Reasoning Datasets (Avg) | Accuracy vs Latent Baselines | Not reported in the paper | Not reported in the paper | +14.1% |
| Math Reasoning Datasets (Avg) | Reasoning Chain Length | Not reported in the paper | Not reported in the paper | -53.3% |
| MATH | Accuracy Gain | Not reported in the paper | Not reported in the paper | +5.36% |