PDDL: Planning Domain Definition Language, a standard language for specifying planning domains (predicates and operators) and planning problems (objects, initial state, and goals) in symbolic planning.
TAMP: Task and Motion Planning—a framework integrating high-level symbolic reasoning (what to do) with low-level motion generation (how to move).
Operator Discovery: The process of identifying new action schemas (operators) when existing ones are insufficient to reach a goal.
PPO: Proximal Policy Optimization—a popular reinforcement learning algorithm used for training the robot's control policies.
Dense Reward: A feedback signal provided at every time step (e.g., one based on distance to the target) to guide learning, as opposed to a sparse reward given only upon success.
Reward Shaping: Adding auxiliary guidance terms to the reward function so that the agent receives useful feedback before it reaches the goal, which typically speeds up learning.
MimicGen: A data generation system for robotic manipulation, used here as the simulation environment.
Grounding: Connecting abstract symbols (like 'open-drawer') to concrete physical states or actions in the real world.
Plannable States: States from which the symbolic planner can find a sequence of already-known operators that reaches the goal.
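
To make the terms "operator" and "plannable state" concrete, here is a minimal sketch in Python, not the method of any particular system. It assumes a STRIPS-style encoding in which a state is a set of true facts and each operator has preconditions, add effects, and delete effects. Apart from 'open-drawer', every fact name, operator name, and helper function below is hypothetical and chosen only for illustration. A state counts as plannable when a breadth-first search over the known operators reaches the goal.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Operator(NamedTuple):
    """A STRIPS-style action schema, already grounded (illustrative only)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str]


def apply(state: FrozenSet[str], op: Operator) -> FrozenSet[str]:
    """Return the successor state: remove deleted facts, then add new ones."""
    return (state - op.del_effects) | op.add_effects


def find_plan(state, goal, operators) -> Optional[List[str]]:
    """Breadth-first search for a sequence of operator names reaching the goal."""
    start = frozenset(state)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        current, plan = frontier.popleft()
        if goal <= current:  # every goal fact holds
            return plan
        for op in operators:
            if op.preconditions <= current:  # operator is applicable
                nxt = apply(current, op)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [op.name]))
    return None  # search space exhausted without reaching the goal


def is_plannable(state, goal, operators) -> bool:
    """A state is plannable if the known operators suffice to reach the goal."""
    return find_plan(state, goal, operators) is not None


# Hypothetical toy domain: put a mug into a drawer.
operators = [
    Operator("open-drawer",
             frozenset({"drawer-closed", "hand-empty"}),
             frozenset({"drawer-open"}),
             frozenset({"drawer-closed"})),
    Operator("pick-mug",
             frozenset({"hand-empty", "mug-on-table"}),
             frozenset({"holding-mug"}),
             frozenset({"hand-empty", "mug-on-table"})),
    Operator("place-in-drawer",
             frozenset({"holding-mug", "drawer-open"}),
             frozenset({"mug-in-drawer", "hand-empty"}),
             frozenset({"holding-mug"})),
]
goal = {"mug-in-drawer"}

start_state = {"drawer-closed", "hand-empty", "mug-on-table"}
stuck_state = {"drawer-closed", "holding-mug"}  # no known operator applies

print(find_plan(start_state, goal, operators))
# ['open-drawer', 'pick-mug', 'place-in-drawer']
print(is_plannable(stuck_state, goal, operators))
# False
```

In this toy domain the second state is not plannable because no known operator opens the drawer while the mug is held. That is the kind of gap operator discovery is meant to close, for example by adding an operator that opens the drawer with an occupied hand.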