
DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving

Tianqi Wang, Enze Xie, Ruihang Chu, Zhenguo Li, Ping Luo
arXiv.org (2024)
MM Reasoning Benchmark Agent

📝 Paper Summary

End-to-End Autonomous Driving, Explainable AI (XAI) for Driving
DriveCoT introduces a challenging driving dataset and a baseline agent that uses explicit Chain-of-Thought reasoning steps, such as identifying specific hazards and traffic rules, to improve the interpretability and performance of end-to-end driving systems.
Core Problem
End-to-end driving models typically operate as black boxes that map sensor inputs directly to control outputs without explainable intermediate steps, which hinders trust and real-world deployment.
Why it matters:
  • Current modular designs rely on complex hand-crafted rules and accumulate errors across modules, while end-to-end methods lack controllability
  • Existing datasets (nuScenes, BDD) primarily contain simple, low-speed scenarios and lack detailed reasoning logic (why a decision was made)
  • Safety-critical deployment requires understanding not just the action (brake) but the cause (red light vs. pedestrian)
Concrete Example: In a high-speed scenario, a standard model might brake without explanation. DriveCoT explicitly reasons: 'Traffic light is red' -> 'Braking required' -> 'Decelerate', distinguishing this from braking for a pedestrian or lead vehicle.
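
The reasoning chain in this example can be sketched as a small rule cascade. This is an illustrative sketch only: the `Observation` fields, the order in which conditions are checked, and the fallback labels are assumptions made for the example, not the expert policy or the agent described in the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """Hypothetical scene summary; field names are assumptions for illustration."""
    red_light: bool = False
    pedestrian_ahead: bool = False
    lead_vehicle_too_close: bool = False


def reason(obs: Observation) -> Tuple[List[str], str]:
    """Return (reasoning chain, action) so the cause of braking stays explicit."""
    # Each rule pairs a condition with the human-readable cause it reports.
    rules = [
        (obs.red_light, "Traffic light is red"),
        (obs.pedestrian_ahead, "Pedestrian ahead"),
        (obs.lead_vehicle_too_close, "Lead vehicle too close"),
    ]
    for triggered, cause in rules:
        if triggered:
            return [cause, "Braking required"], "Decelerate"
    return ["No hazard detected"], "Keep speed"


chain, action = reason(Observation(red_light=True))
print(" -> ".join(chain + [action]))
# Traffic light is red -> Braking required -> Decelerate
```

The same "Decelerate" action is returned for a pedestrian or a close lead vehicle, but the first element of the chain differs, which is the distinction the example is meant to expose.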
Key Novelty
DriveCoT Dataset & Agent
  • Constructs a dataset using a rule-based expert in the CARLA simulator that records not just control actions but the 'Chain-of-Thought' (CoT) reasoning (e.g., 'stop sign ahead', 'collision risk') used to generate them.
  • Proposes a multi-view video-based agent that predicts these intermediate CoT steps (hazards, traffic rules, vehicle relations) alongside the final trajectory to enforce interpretable decision-making.
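
One way to picture a sample that stores reasoning labels alongside the planned trajectory is a record like the one below. The class name, field names, value formats, and the `to_text` helper are all hypothetical; the dataset's actual schema is not given in this summary.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CoTSample:
    """Hypothetical labeled sample: reasoning labels plus the resulting plan."""
    hazards: List[str] = field(default_factory=list)            # e.g. "collision risk"
    traffic_rules: List[str] = field(default_factory=list)      # e.g. "stop sign ahead"
    vehicle_relations: List[str] = field(default_factory=list)  # relations to nearby vehicles
    # Future waypoints as (x, y) pairs; units and frame are assumed, not from the paper.
    trajectory: List[Tuple[float, float]] = field(default_factory=list)

    def to_text(self) -> str:
        """Join the non-empty reasoning labels into one readable explanation."""
        parts = self.traffic_rules + self.hazards + self.vehicle_relations
        return "; ".join(parts) if parts else "no notable conditions"


sample = CoTSample(
    hazards=["collision risk"],
    traffic_rules=["stop sign ahead"],
    trajectory=[(0.0, 0.0), (0.0, 1.5)],
)
print(sample.to_text())  # stop sign ahead; collision risk
```

Keeping the reasoning fields separate from the trajectory mirrors the idea that the agent predicts the intermediate steps alongside the final output, so each can be inspected on its own.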
Evaluation Highlights
  • Dataset includes 1058 scenarios and 36,000 labeled samples collected at 2 Hz, comparable to the scale of nuScenes but with reasoning labels.
  • Includes a substantial portion of high-speed driving data (above 60 km/h), addressing a gap in existing datasets dominated by low-speed (<30 km/h) scenarios.
  • Demonstrates strong performance in open-loop and closed-loop settings on the CARLA Leaderboard 2.0 benchmarks (specific performance scores not reported in provided text).
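
As a rough sanity check on scale, the dataset figures quoted above imply the totals computed below. This assumes the 36,000 samples are evenly spaced at 2 Hz and spread across all 1058 scenarios, which the summary does not state explicitly; the variable names are illustrative.

```python
NUM_SCENARIOS = 1058
NUM_SAMPLES = 36_000
SAMPLE_RATE_HZ = 2

# Total labeled driving time implied by the sample count and sampling rate.
total_seconds = NUM_SAMPLES / SAMPLE_RATE_HZ
total_hours = total_seconds / 3600

# Average scenario length, assuming samples are spread over all scenarios.
samples_per_scenario = NUM_SAMPLES / NUM_SCENARIOS
seconds_per_scenario = samples_per_scenario / SAMPLE_RATE_HZ

print(f"{total_hours:.1f} h total, ~{seconds_per_scenario:.0f} s per scenario")
# 5.0 h total, ~17 s per scenario
```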
Breakthrough Assessment
7/10
Significant contribution in dataset creation for interpretable driving, addressing the lack of reasoning labels in end-to-end driving. The high-speed focus is valuable. The methodology is sound, though the snippet lacks quantitative performance comparisons.