| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Model scaling and CoT impact: Larger models and CoT prompting accelerate learning. | | | | |
| CoT-ICL Lab (Complex DAG) | Accuracy | 0.45 | 0.98 | +0.53 |
| CoT-ICL Lab (Complex DAG) | Accuracy | 0.35 | 0.98 | +0.63 |
| Impact of token function diversity on learning causal structure. | | | | |
| CoT-ICL Lab | Attention to Parents (Structure Learning) | 0.20 | 0.90 | +0.70 |