MIRC: Minimal Recognisable Configuration, the smallest spatial crop or spatiotemporal region of a video that remains identifiable by humans
sub-MIRC: A spatial or spatiotemporal reduction of a video that falls below the threshold of human recognisability
Epic-ReduAct: The dataset introduced in this paper, consisting of spatially reduced and temporally scrambled videos derived from EPIC-KITCHENS-100
Side4Video: A specific state-of-the-art video action recognition model used as the primary AI subject in this study
Ego-centric: First-person point of view (e.g., camera on a person's head), focusing on hands and object interactions
Average Reduction Rate: A metric quantifying the rate at which recognition performance declines as spatial or temporal information is removed
Recognition Gap: The difference in recognition accuracy between human observers and the AI model at specific reduction levels
LTA: Low Temporal Actions, i.e. actions that can be recognised primarily from static spatial cues (e.g., holding something)
HTA: High Temporal Actions, i.e. actions for which motion and temporal evolution are critical for recognition (e.g., shaking)
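
The two metrics defined above (Recognition Gap and Average Reduction Rate) can be illustrated with a minimal sketch. The glossary does not give exact formulas, so the code makes two assumptions: the gap is taken as human accuracy minus model accuracy at each reduction level, and the reduction rate is taken as the mean accuracy drop between consecutive reduction levels. The function names and the accuracy values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def recognition_gap(human_acc, model_acc):
    """Per-level gap, assumed here to be human accuracy minus model accuracy."""
    if len(human_acc) != len(model_acc):
        raise ValueError("accuracy lists must cover the same reduction levels")
    return [h - m for h, m in zip(human_acc, model_acc)]


def average_reduction_rate(acc):
    """Mean accuracy drop between consecutive reduction levels.

    This is one plausible reading of the metric, not the paper's formula.
    """
    if len(acc) < 2:
        raise ValueError("need at least two reduction levels")
    return mean(a - b for a, b in zip(acc, acc[1:]))


# Made-up accuracies, ordered from least to most reduced input.
human = [1.0, 0.75, 0.5]
model = [0.75, 0.25, 0.0]

gaps = recognition_gap(human, model)
human_rate = average_reduction_rate(human)
model_rate = average_reduction_rate(model)
print(gaps, human_rate, model_rate)
```

Under these assumptions, a larger reduction rate for the model than for humans would indicate that the model loses accuracy faster as information is removed.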