
Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation

Zhongzhen Huang, Linda Wei, Shaoting Zhang, Xiaofan Zhang
Shanghai Jiao Tong University, Shanghai AI Laboratory
arXiv (2024)
MM Benchmark

📝 Paper Summary

Medical Image Segmentation, Multi-modal Learning, Brain Tumor Segmentation
MASM improves multi-modal brain tumor segmentation with two modules: a Modality-Aware module that models low-level pairwise modality interactions, and a Modality-Shift module that captures high-level, complex modality relationships via a shift operation.
Core Problem
Existing multi-modal brain tumor segmentation methods often fail to effectively exchange information across scales or model complex, non-linear relationships between different MRI modalities (T1, T2, etc.).
Why it matters:
  • Accurate tumor segmentation is critical for diagnosis and pre-surgical planning, but manual delineation is error-prone.
  • Simple concatenation (early fusion) overlooks inter-modality relationships essential for clinical diagnosis.
  • Current fusion methods often neglect multi-scale information exchange or high-level spatial-modality fusion.
Concrete Example: A tumor might be clearly visible in T2-weighted images but ambiguous in T1. A method that simply concatenates its inputs, without explicitly modeling interactions between modalities, may fail to use the T2 signal to resolve the T1 ambiguity.
Key Novelty
Modality-Aware and Shift Mixer (MASM)
  • Modality-Aware Module: Explicitly models pairwise dependencies between specific modality pairs (e.g., T2 and FLAIR) at low feature levels, mimicking how radiologists combine specific scans.
  • Modality-Shift Module: Swaps image patches between modalities in a mosaic pattern before self-attention at high levels, enabling cross-modality interaction without extra parameters.
  • Adaptive Token Pruning: Uses a decision mask to prune redundant patches in the Modality-Aware module, replacing them with features from aligned modalities.
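The last two ideas above can be illustrated with a minimal, parameter-free sketch. It assumes tokens are laid out as (modalities, height, width, channels) and uses a diagonal mosaic in which the patch at grid position (i, j) is rotated by (i + j) mod M modality slots. The mosaic pattern, the tensor layout, and the names `modality_shift` and `substitute_pruned` are illustrative assumptions, not the authors' implementation; the decision mask is taken as a given input because the summary does not say how it is computed.

```python
import numpy as np


def modality_shift(tokens: np.ndarray, reverse: bool = False) -> np.ndarray:
    """Swap patches between modalities in a mosaic pattern (no learned weights).

    tokens: array of shape (M, H, W, C) with M modalities on an H x W patch grid.
    The diagonal mosaic used here is an assumption made for illustration.
    """
    m, h, w, _ = tokens.shape
    # offset[i, j]: number of modality slots the patch at (i, j) is rotated by
    offset = (np.arange(h)[:, None] + np.arange(w)[None, :]) % m   # (H, W)
    sign = -1 if reverse else 1
    # src[k, i, j]: modality that output slot k reads from at position (i, j)
    src = (np.arange(m)[:, None, None] - sign * offset[None]) % m  # (M, H, W)
    rows = np.arange(h)[None, :, None]
    cols = np.arange(w)[None, None, :]
    return tokens[src, rows, cols]


def substitute_pruned(own: np.ndarray, aligned: np.ndarray, keep: np.ndarray) -> np.ndarray:
    """Keep a modality's own patch features where keep is True; elsewhere use
    the features of the spatially aligned patch from another modality.

    own, aligned: (H, W, C); keep: boolean decision mask of shape (H, W),
    assumed to be produced elsewhere.
    """
    return np.where(keep[..., None], own, aligned)


if __name__ == "__main__":
    x = np.random.rand(4, 8, 8, 16)          # e.g. four MRI modalities
    mixed = modality_shift(x)                # each slot now holds mixed modalities
    restored = modality_shift(mixed, reverse=True)
    print(np.allclose(restored, x))
```

After the shift, every modality slot contains patches from several modalities, so a self-attention layer applied per slot would see cross-modality context without any added parameters; the reverse call puts each patch back in its original slot.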
Architecture
Figure 2: The U-Net-based architecture of MASM, showing the shared encoder, the placement of Modality-Aware modules in the skip connections, and the Modality-Shift module at the bottleneck.
Evaluation Highlights
  • Outperforms state-of-the-art methods on BraTS 2021 with an average Dice score of 0.912, surpassing the previous best (NestedFormer) by 0.004.
  • Requires far less computation: ~160 GFLOPs compared to UNETR's ~1159 GFLOPs, roughly 7x fewer.
  • Demonstrates robust performance across all tumor sub-regions (Enhancing Tumor, Whole Tumor, Tumor Core) in 5-fold cross-validation.
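For reference, the Dice score reported above measures overlap between a predicted and a ground-truth mask as 2|A ∩ B| / (|A| + |B|). The sketch below computes it per region and averages the results. The helper names `dice_score` and `mean_dice`, the dictionary layout, and the choice to score two empty masks as 1.0 are assumptions for illustration; the paper's exact evaluation code may differ.

```python
import numpy as np


def dice_score(pred, target) -> float:
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def mean_dice(pred_regions: dict, target_regions: dict):
    """Average Dice over named regions, e.g. ET, WT and TC.

    Both arguments map a region name to a binary mask.
    Returns (mean, per-region scores).
    """
    scores = {
        name: dice_score(pred_regions[name], target_regions[name])
        for name in target_regions
    }
    return sum(scores.values()) / len(scores), scores
```

A gap of 0.004 in average Dice, as reported against NestedFormer, corresponds to a small change in overlap, which is why the gain is described as incremental.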
Breakthrough Assessment
7/10
Solid architectural innovation with the Shift module and specific modality pairing. Outperforms SOTA with significantly lower FLOPs, though the absolute metric gain is incremental.