Core Knowledge: Fundamental cognitive abilities innate to humans or developed early (e.g., object permanence, counting) that underpin advanced reasoning
Sensorimotor Stage: Piaget's first developmental stage where infants develop concepts like object permanence through sensory interaction
Preoperational Stage: Piaget's second stage characterized by the development of symbolic representations
Concrete Operational Stage: Piaget's third stage involving systematic reasoning about numbers, motion, and perspective
Formal Operational Stage: Piaget's fourth stage involving abstract reasoning and intentionality
Circular Evaluation: An evaluation strategy that rotates answer options cyclically to mitigate position bias in multiple-choice questions
Concept Hacking: A proposed evaluation method that manipulates causal image features to test if models rely on shortcuts versus genuine concept understanding
MLLM: Multi-modal Large Language Model; an AI model capable of processing and reasoning over both text and visual inputs
MCQ: Multiple-Choice Question
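
As an illustration of the Circular Evaluation entry above, the following is a minimal sketch of rotating the options of one MCQ and scoring the result. The function names (`rotations`, `circular_correct`) and the `predict` callback are hypothetical, and the rule that a question counts as correct only if the model answers correctly under every rotation is an assumption about the scoring, not something this glossary states.

```python
from typing import Callable, List, Sequence, Tuple


def rotations(options: Sequence[str], answer_index: int) -> List[Tuple[List[str], int]]:
    """Return every cyclic shift of the options with the answer's new position."""
    n = len(options)
    result = []
    for shift in range(n):
        # Shift the options left by `shift` positions, wrapping around.
        rotated = [options[(i + shift) % n] for i in range(n)]
        # The correct option moves left by the same amount.
        new_index = (answer_index - shift) % n
        result.append((rotated, new_index))
    return result


def circular_correct(
    question: str,
    options: Sequence[str],
    answer_index: int,
    predict: Callable[[str, List[str]], int],  # hypothetical model wrapper
) -> bool:
    """Assumed scoring rule: correct only if every rotation is answered correctly."""
    return all(
        predict(question, rotated) == idx
        for rotated, idx in rotations(options, answer_index)
    )


if __name__ == "__main__":
    opts = ["Paris", "Rome", "Oslo"]
    # A position-biased stand-in model that always picks the first option.
    always_first = lambda q, o: 0
    print(circular_correct("Capital of France?", opts, 0, always_first))
```

A model that always picks the same position can be right on at most one rotation, so under this assumed rule it gets no credit for the question; that is the sense in which rotation mitigates position bias.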