CRL: Coarse-to-fine Reinforcement Learning—a framework in which agents iteratively zoom into progressively finer discretizations of a continuous action space
CQN: Coarse-to-fine Q-Network—the specific value-based algorithm implementation of the CRL framework
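The coarse-to-fine action selection behind CRL/CQN can be sketched in a few lines. This is a minimal 1-D illustration, not the paper's implementation: the real agent conditions each level's Q-values on observations and previously chosen bins, and acts over multiple joint dimensions. The `q_fn`, `levels`, and `bins` names here are hypothetical.

```python
import numpy as np

def coarse_to_fine_action(q_fn, low=-1.0, high=1.0, levels=3, bins=5):
    """Select a continuous action by repeatedly discretizing the current
    interval into `bins` candidates and zooming into the greedy bin.
    `q_fn` scores an array of candidate actions (a stand-in for the critic)."""
    for _ in range(levels):
        centers = np.linspace(low, high, bins)   # coarse candidate actions
        best = int(np.argmax(q_fn(centers)))     # greedy pick at this level
        half = (high - low) / (2 * bins)         # half-width of one bin
        low, high = centers[best] - half, centers[best] + half
    return 0.5 * (low + high)                    # midpoint of the final bin
```

With three levels of five bins, the agent effectively chooses among 5^3 = 125 actions while only ever evaluating 15 discrete options, which is what lets a value-based method act in a continuous space.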
Actor-Critic: RL architecture with separate policy (actor) and value (critic) networks; often unstable in continuous control
Value-based RL: RL methods (like DQN) that learn Q-values for actions and pick the highest-valued one; typically more stable than actor-critic but naturally limited to discrete action sets
BC: Behavior Cloning—supervised learning from expert demonstrations
C2F-ARM: A prior coarse-to-fine method specific to next-best-pose agents; CQN is a generalization of this to continuous joint control
Distributional Critic: A critic that predicts the full distribution of returns (probabilities over value bins) rather than just its expected value
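A categorical distributional critic can still drive greedy action selection by collapsing each action's return distribution to its mean. A minimal sketch, assuming a fixed support of evenly spaced value atoms (the `v_min`/`v_max` bounds and bin count are illustrative, not taken from the paper):

```python
import numpy as np

def expected_q(probs, v_min=-1.0, v_max=1.0):
    """Collapse categorical return distributions to mean Q-values.
    `probs` has shape (num_actions, num_bins), each row summing to 1;
    the bins are fixed value atoms spanning [v_min, v_max]."""
    support = np.linspace(v_min, v_max, probs.shape[1])  # value atoms
    return probs @ support                               # per-action expectation
```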
Polyak averaging: A technique that updates target network parameters slowly, as an exponential moving average of the online network's parameters, to stabilize training
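The update itself is a one-liner per parameter. A minimal sketch with parameters stored as a plain dict (the step size `tau` is a hypothetical value; frameworks apply the same rule tensor-by-tensor):

```python
def polyak_update(target, online, tau=0.005):
    """Move each target parameter a small step toward the online parameter:
    target <- (1 - tau) * target + tau * online."""
    return {k: (1 - tau) * target[k] + tau * online[k] for k in target}
```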
Dueling Network: A neural network architecture that separates state value estimation from action advantage estimation
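The dueling combination step can be sketched independently of the network that produces the two streams. A minimal illustration (the mean-subtraction follows the standard dueling formulation; shapes and names are illustrative):

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a):
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage keeps the V/A decomposition identifiable."""
    return value + advantages - advantages.mean()
```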
SiLU: Sigmoid Linear Unit—an activation function used in the neural networks
AdamW: The Adam optimizer with decoupled weight decay
RLBench: A benchmark environment for robot learning tasks
DMC: DeepMind Control Suite—a standard benchmark for continuous control physics tasks