SAR: Synthetic Aperture Radar, an active remote sensing technology that forms images by bouncing radar signals off the Earth's surface; unlike optical cameras, it can image through clouds and at night
Bi-temporal: Using two images of the same location taken at different times (pre-disaster and post-disaster) to identify changes
mIoU: mean Intersection over Union, a standard metric for how well predicted regions match ground truth: the IoU (area of overlap divided by area of union), averaged over classes or samples; 100% means perfect overlap with ground truth
cIoU: cumulative Intersection over Union, a variant of IoU used here for evaluating referring segmentation: the total intersection divided by the total union, with both accumulated over all evaluated samples
Grounding: The ability of a model to link textual concepts (e.g., 'damaged building') to specific pixels or regions in an image
Referring Segmentation: A task where the model must segment a specific object in an image described by a natural language expression
Optical imagery: Standard satellite photos capturing visible light (like a camera)
VLM: Vision-Language Model, an AI model trained to understand and generate content based on both images and text
Instruction Tuning: Fine-tuning a model on pairs of (instruction, output) to improve its ability to follow user commands
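
The difference between the two IoU metrics defined above is easiest to see on a toy case. The following is a minimal sketch, not the evaluation code of any particular benchmark: the function names (iou, mean_iou, cumulative_iou) are hypothetical, mIoU is assumed to be averaged per sample, and an empty prediction matched against an empty ground truth is assumed to score 1.0.

```python
import numpy as np

def iou(pred, gt):
    """IoU of two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

def mean_iou(preds, gts):
    """Average of per-sample IoU: every sample weighs the same."""
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))

def cumulative_iou(preds, gts):
    """Total intersection over total union: large objects weigh more."""
    inter = sum(np.logical_and(np.asarray(p, dtype=bool), np.asarray(g, dtype=bool)).sum()
                for p, g in zip(preds, gts))
    union = sum(np.logical_or(np.asarray(p, dtype=bool), np.asarray(g, dtype=bool)).sum()
                for p, g in zip(preds, gts))
    return float(inter / union) if union else 1.0

# Toy data: a small object that is half missed, and a large object that is matched exactly.
gts = [np.array([1, 1, 0, 0, 0, 0, 0, 0]), np.ones(8, dtype=int)]
preds = [np.array([1, 0, 0, 0, 0, 0, 0, 0]), np.ones(8, dtype=int)]

print(mean_iou(preds, gts))        # 0.75  (average of 0.5 and 1.0)
print(cumulative_iou(preds, gts))  # 0.9   (9 intersecting pixels / 10 union pixels)
```

In this sketch the per-sample average penalizes the missed small object more heavily, while the cumulative variant is dominated by the large, well-segmented one.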