Evaluation Setup
Drone racing through a sequence of 7 gates, both in simulation (Flightmare) and in the real world (Agilicious quadrotor).
Benchmarks:
- Flightmare Simulation (Gate traversal / Racing)
- Real-World Flight (Gate traversal / Racing) [New]
Metrics:
- Success Rate (SR)
- Lap Time
- Crash Rate
Statistical methodology: results are averaged over 5 seeds in simulation.
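
The metrics listed above, and the averaging over seeds, can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the `Episode` record, its field names, and both helper functions are hypothetical, and it assumes that success rate and crash rate are per-episode fractions and that lap time is averaged over completed laps only.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Episode:
    """Hypothetical per-episode record; the field names are illustrative."""
    completed: bool              # assumed meaning: every gate was passed
    crashed: bool
    lap_time_s: Optional[float]  # None when the lap was not completed


def summarize_seed(episodes: List[Episode]) -> Dict[str, Optional[float]]:
    """Compute the three metrics for the episodes of a single seed."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    lap_times = [e.lap_time_s for e in episodes
                 if e.completed and e.lap_time_s is not None]
    return {
        "success_rate": sum(e.completed for e in episodes) / n,
        "crash_rate": sum(e.crashed for e in episodes) / n,
        # Assumption: lap time is only defined for completed laps.
        "lap_time_s": mean(lap_times) if lap_times else None,
    }


def average_over_seeds(per_seed: List[Dict[str, Optional[float]]]) -> Dict[str, Optional[float]]:
    """Average each metric across seeds, skipping seeds where it is undefined."""
    averaged = {}
    for key in per_seed[0]:
        values = [s[key] for s in per_seed if s[key] is not None]
        averaged[key] = mean(values) if values else None
    return averaged
```

With five seeds, `average_over_seeds` would receive five dictionaries, one per call to `summarize_seed`.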
Key Results
Simulation comparisons showing the inability of model-free RL (PPO) to learn from pixels compared to the proposed model-based approach.

| Benchmark | Metric | Baseline | This Paper | Δ |
| --- | --- | --- | --- | --- |
| Flightmare Simulation | Success Rate | 0 | 1.0 | +1.0 |
| Flightmare Simulation | Success Rate | 1.0 | 1.0 | 0.0 |

Ablation on observation types confirming raw pixels are viable compared to simplified masks.

| Benchmark | Metric | Baseline | This Paper | Δ |
| --- | --- | --- | --- | --- |
| Flightmare Simulation | Lap Time (s) | 5.5 | 5.8 | +0.3 |
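
The Δ column in the results above appears to be "This Paper" minus "Baseline". The sketch below recomputes it from the reported values as a sanity check; the `rows` structure and the `delta` helper are illustrative names, not from the paper. Note that the sign reads differently per metric: a positive Δ is an improvement for success rate, but for lap time it means a slower lap.

```python
# Values copied from the results table: (benchmark, metric, baseline, this_paper).
rows = [
    ("Flightmare Simulation", "Success Rate", 0.0, 1.0),
    ("Flightmare Simulation", "Success Rate", 1.0, 1.0),
    ("Flightmare Simulation", "Lap Time (s)", 5.5, 5.8),
]


def delta(baseline: float, ours: float, ndigits: int = 1) -> float:
    """Difference 'ours - baseline', rounded to hide floating-point noise."""
    return round(ours - baseline, ndigits)


deltas = [delta(baseline, ours) for _, _, baseline, ours in rows]

for (benchmark, metric, baseline, ours), d in zip(rows, deltas):
    print(f"{benchmark} | {metric} | {baseline} | {ours} | {d:+.1f}")
```

The rounding is there because `5.8 - 5.5` is not exactly `0.3` in binary floating point.
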
Main Takeaways
- Model-free RL (PPO) fails to learn the racing task directly from pixels, which motivates the model-based approach.
- The learned policy exhibits emergent 'active perception' behaviors, orienting the camera toward gates without explicit reward shaping.
- Zero-shot sim-to-real transfer is successful, enabling the drone to fly through gates in the real world using only onboard camera processing.
- Intermediate representations (such as masks) simplify learning but are not strictly necessary with DreamerV3, which can learn directly from raw RGB images.