Evaluation Setup
Tasks: dense 3D reconstruction, depth estimation, and camera pose estimation, evaluated across several datasets.
Benchmarks:
- ScanNet (Indoor 3D Reconstruction)
- ScanNet++ (Indoor 3D Reconstruction)
- 7-Scenes (Camera Pose Estimation)
- Bontar (Dynamic Scene Reconstruction)
Metrics:
- ATE (Absolute Trajectory Error)
- AbsRel (Absolute Relative Error for depth)
- F-score (Geometry quality)
- Acc (Accuracy of reconstructed points against ground truth)
Statistical methodology: not explicitly reported in the paper.
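To make the listed metrics concrete, here is a minimal NumPy sketch of AbsRel, ATE, and F-score. It is illustrative only and is not the paper's evaluation code: the function names are hypothetical, ATE is computed as an RMSE under the assumption that the estimated trajectory has already been aligned to the reference frame, and the F-score uses a brute-force nearest-neighbor search with an assumed distance threshold `tau`.

```python
import numpy as np

def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground-truth depth."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    valid = gt > 0  # assumption: non-positive depth marks missing ground truth
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

def ate_rmse(est_positions, ref_positions):
    """RMSE of per-frame camera position error.

    Assumes both trajectories are (N, 3) arrays already expressed in the
    same frame; no alignment step is performed here.
    """
    est = np.asarray(est_positions, dtype=float)
    ref = np.asarray(ref_positions, dtype=float)
    return float(np.sqrt(np.mean(np.sum((est - ref) ** 2, axis=1))))

def f_score(pred_pts, gt_pts, tau):
    """Harmonic mean of precision and recall at distance threshold tau.

    Brute-force O(P * G) distance matrix; fine for small point sets only.
    """
    pred_pts = np.asarray(pred_pts, dtype=float)
    gt_pts = np.asarray(gt_pts, dtype=float)
    d = np.linalg.norm(pred_pts[:, None, :] - gt_pts[None, :, :], axis=2)
    precision = np.mean(d.min(axis=1) < tau)  # predicted points near some GT point
    recall = np.mean(d.min(axis=0) < tau)     # GT points near some predicted point
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

Lower is better for AbsRel and ATE, and higher is better for F-score. Real evaluations typically align trajectories before computing ATE and use a spatial index instead of the dense distance matrix.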
Main Takeaways
- Point3R outperforms implicit memory methods (Spann3R, CUT3R) on long-sequence reconstruction by maintaining spatial context.
- The explicit spatial memory allows the model to handle dynamic scenes effectively by updating spatial features at specific locations.
- The method achieves results competitive with pairwise methods (DUSt3R) while being significantly more efficient, because its streaming design avoids N^2 pairwise matching and global optimization.
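The efficiency point in the last takeaway can be illustrated with a simple count. This sketch assumes a pairwise pipeline that matches every unordered pair of N frames and a streaming pipeline that processes each frame once. Both function names are hypothetical, and the counts ignore per-step cost, so they indicate scaling rather than measured runtime.

```python
def pairwise_count(n_frames: int) -> int:
    """Number of unordered frame pairs in an all-pairs matching scheme."""
    return n_frames * (n_frames - 1) // 2

def streaming_steps(n_frames: int) -> int:
    """Number of updates when each incoming frame is processed exactly once."""
    return n_frames

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pairwise_count(n), streaming_steps(n))
```

For 100 frames this gives 4950 pairs against 100 streaming updates, and the gap grows quadratically with sequence length.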