| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| In-Distribution (ID) comparison: TrajAD outperforms general-purpose zero-shot baselines on every metric, particularly in localization. | ||||
| TrajBench | Macro-F1 | 70.43 | 81.81 | +11.38 |
| TrajBench | Joint Exact Match (JEM) | 5.54 | 53.75 | +48.21 |
| TrajBench | Recall | 28.46 | 88.16 | +59.70 |
| Out-of-Distribution (OOD) transfer: the Embodied AI domain is held out during training and used as the evaluation target. | ||||
| TrajBench (Target: Embodied AI) | Macro-F1 | 70.89 | 83.09 | +12.20 |
| TrajBench (Target: Embodied AI) | Joint Exact Match (JEM) | 11.48 | 38.25 | +26.77 |
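
The Δ column appears to be the plain difference between the two score columns. The sketch below recomputes it from the values in the table, assuming Δ = This Paper − Baseline, rounded to two decimals. The `rows` list and the `delta` helper are illustrative names, not part of the paper.

```python
# Scores copied from the table above.
# Assumption: Δ is (This Paper - Baseline), rounded to two decimals.
rows = [
    # (setting, metric, baseline, this_paper, reported_delta)
    ("ID", "Macro-F1", 70.43, 81.81, 11.38),
    ("ID", "JEM", 5.54, 53.75, 48.21),
    ("ID", "Recall", 28.46, 88.16, 59.70),
    ("OOD", "Macro-F1", 70.89, 83.09, 12.20),
    ("OOD", "JEM", 11.48, 38.25, 26.77),
]


def delta(baseline: float, this_paper: float) -> float:
    """Absolute gain of the paper's score over the baseline."""
    return round(this_paper - baseline, 2)


for setting, metric, baseline, this_paper, reported in rows:
    computed = delta(baseline, this_paper)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{setting:>3} {metric:<8} computed={computed:6.2f} reported={reported:6.2f} {status}")
```

Under that assumption, all five reported Δ values match the recomputed differences.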