Embodied Agent: An AI system that controls a physical or simulated body (e.g., a robot) to perform tasks in an environment
Low-level actions: Atomic commands directly executable by robots, such as specific movement distances (meters) or joint rotations (degrees)
High-level actions: Abstract, semantic commands composed of multiple low-level primitives, like 'Find apple' or 'Pick up mug'
POMDP: Partially Observable Markov Decision Process—a mathematical framework for decision-making where the agent cannot directly see the entire state of the world
Ego-centric vision: Visual input captured from the robot's own perspective (first-person view)
Kinematic: Relating to the motion of points, bodies, and systems without considering the forces that cause them
AI2-THOR: A photorealistic interactive environment for embodied AI agents
Habitat: A high-performance 3D simulator for training virtual robots
YOLO: You Only Look Once—a real-time object detection system used here to provide bounding boxes to the agent
Euler angles: Three angles (roll, pitch, yaw) used to describe the orientation of a rigid body
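
To make the Euler angles entry concrete, here is a minimal sketch of turning roll, pitch, and yaw into a rotation matrix. It assumes the common ZYX (yaw, then pitch, then roll) convention with angles in radians. The source does not say which convention its simulators use, so treat the ordering as an assumption. The function names `euler_to_matrix` and `rotate` are hypothetical and are not part of AI2-THOR or Habitat.

```python
import math


def euler_to_matrix(roll, pitch, yaw):
    """Build a 3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll).

    Assumption: ZYX convention, angles in radians. Other conventions
    (different axis order or intrinsic/extrinsic choice) give different matrices.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-element vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Usage: a pure 90-degree yaw turns the x-axis toward the y-axis.
facing = rotate(euler_to_matrix(0.0, 0.0, math.pi / 2), [1.0, 0.0, 0.0])
```

Under this convention, a pure yaw changes only the heading in the horizontal plane, which is typically the rotation a navigation agent controls when it turns left or right.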