
Uncertainty Estimation for Safety-critical Scene Segmentation via Fine-grained Reward Maximization

Hongzheng Yang, Cheng Chen, Yueyao Chen, Markus Scheppach, Hon-Chi Yip, Qi Dou
The Chinese University of Hong Kong, Harvard Medical School & Massachusetts General Hospital, University Hospital of Augsburg
Neural Information Processing Systems (2023)
RL MM Factuality

📝 Paper Summary

Uncertainty Estimation, Medical Image Segmentation, Reinforcement Learning for Vision
FGRM fine-tunes segmentation models using reinforcement learning with a calibration-based reward and a Fisher Information-weighted parameter update scheme to explicitly align model confidence with prediction risk.
Core Problem
Existing uncertainty estimation methods rely on indirect task losses (like Cross-Entropy) rather than explicit uncertainty metrics during training, leading to miscalibrated confidence and unclear priors in safety-critical scenarios.
Why it matters:
  • In safety-critical domains like robotic surgery, tolerance for prediction risk is extremely low, making reliable uncertainty estimation as crucial as accuracy
  • Current approaches often lack explicit guidance for calibrating prediction risk, resulting in over-confidence on out-of-distribution data or ambiguous tissue boundaries
  • Standard reinforcement learning updates are uniform across parameters, which is suboptimal for dense segmentation tasks where parameter importance varies significantly
Concrete Example: In a laparoscopic surgery scene, a standard segmentation model might label an ambiguous boundary between the liver and fat with high confidence (low uncertainty) because it was trained only to maximize segmentation accuracy (e.g., Dice), not calibration. A surgical robot acting on that over-confident prediction could make a dangerous cut.
Key Novelty
Fine-Grained Reward Maximization (FGRM)
  • Treats uncertainty estimation as a reward maximization problem where the reward is directly the calibration metric (e.g., negative Expected Calibration Error), rather than a proxy loss
  • Uses the diagonal of the Fisher Information Matrix to weigh parameter updates, assigning larger updates to parameters that are more important for the model's output distribution
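The first idea above, using a calibration metric directly as the reward, can be illustrated with a minimal sketch. It computes the standard binned Expected Calibration Error and negates it to get a scalar reward. The function names, the 10-bin default, and the `(H, W, C)` array layout are assumptions made for illustration; the summary does not specify how the paper computes or aggregates its reward.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: bin-weighted mean of |accuracy - mean confidence|."""
    confidences = np.asarray(confidences, dtype=float).ravel()
    correct = np.asarray(correct, dtype=float).ravel()
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin b covers (edges[b], edges[b + 1]]; a confidence of 0 falls in bin 0.
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of pixels in the bin
    return ece


def calibration_reward(probs, labels, n_bins=10):
    """Hypothetical reward: negative ECE of a per-pixel softmax map.

    probs:  (H, W, C) class probabilities (assumed layout)
    labels: (H, W) integer ground-truth classes
    """
    confidences = probs.max(axis=-1)              # confidence of the predicted class
    correct = probs.argmax(axis=-1) == labels     # per-pixel correctness
    return -expected_calibration_error(confidences, correct, n_bins)
```

As a sanity check, four pixels predicted with confidence 0.9 of which only two are correct give an ECE of about 0.4, so the reward is about -0.4; a better-calibrated model would score closer to 0.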
Architecture
Figure 1: Overview of the FGRM framework, illustrating the pre-training and RL fine-tuning phases
Evaluation Highlights
  • Reduces In-Distribution Expected Calibration Error (ECE) by about 2.1 points (11.74 → 9.63) compared with the state-of-the-art NatPN on the Laparoscopic Cholecystectomy (LC) dataset
  • Improves Out-of-Distribution detection (Pixel Ratio) by 0.29 over NatPN and by 0.93 over Deep Ensemble on the LC dataset
  • Maintains real-time inference speed (0.052 ms per image), faster than ensemble methods (0.201 ms per image), while also outperforming them on calibration metrics
Breakthrough Assessment
8/10
Novel application of RL to fine-tune uncertainty calibration directly. The Fisher Information-based update scheme addresses the difficulty of applying RL to dense prediction tasks. Strong empirical results on medical datasets.
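The Fisher Information-weighted update mentioned above might look roughly like the following sketch. It estimates the diagonal of the Fisher Information Matrix as the mean of squared per-sample log-likelihood gradients, then scales a reward-gradient ascent step per parameter. The function names, the max-normalization of the weights, and the learning rate are assumptions for illustration only; the summary does not give the paper's exact update rule.

```python
import numpy as np


def diagonal_fisher(per_sample_grads):
    """Empirical diagonal Fisher: mean of squared per-sample log-likelihood gradients.

    per_sample_grads: (N, P) array, one gradient row per sample.
    """
    g = np.asarray(per_sample_grads, dtype=float)
    return (g ** 2).mean(axis=0)


def fisher_weighted_step(params, reward_grad, fisher_diag, lr=0.1, eps=1e-12):
    """One gradient-ascent step on the reward, scaled per parameter.

    Parameters with larger Fisher values receive larger updates.
    Normalizing by the maximum (an assumption) keeps the weights in [0, 1].
    """
    weights = fisher_diag / (fisher_diag.max() + eps)
    return params + lr * weights * reward_grad


# Usage: the first parameter has much larger Fisher information,
# so it moves much further than the second for the same reward gradient.
grads = np.array([[1.0, 0.1], [-1.0, 0.1]])
fisher = diagonal_fisher(grads)
new_params = fisher_weighted_step(np.zeros(2), np.ones(2), fisher)
```

A uniform update would move both parameters by the same amount; here the step is concentrated on the parameter that most affects the output distribution.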