World Model: A learned internal simulator of the environment that predicts future states and rewards, so the agent can be trained on imagined rollouts generated by the model
Catastrophic Forgetting: The tendency of neural networks to abruptly lose knowledge of previously learned tasks when trained on new information
Reservoir Sampling: A randomized algorithm to maintain a representative sample of a stream of unknown size using fixed memory
RSSM: Recurrent State-Space Model—a specific neural architecture used in Dreamer agents to model environment dynamics using deterministic and stochastic components
FIFO: First-In-First-Out—a buffer strategy that discards the oldest data to make room for new data
DreamerV3: A model-based RL algorithm that learns a World Model and performs well across diverse domains with a single fixed set of hyperparameters
Spliced Rollouts: Long episodes cut into smaller fixed-length chunks to allow finer-grained management of storage and sampling
SAC: Soft Actor-Critic—a popular off-policy model-free RL algorithm that maximizes a trade-off between expected return and entropy
Forward Transfer: The ability of an agent to use knowledge from previous tasks to learn new tasks faster or better
Backward Transfer: The change in performance on previous tasks that results from training on new tasks; it is positive when earlier tasks improve and negative when they are forgotten
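
The three buffer-related entries above (Spliced Rollouts, FIFO, Reservoir Sampling) can be illustrated together. The following is a minimal Python sketch, not the implementation used by any particular agent. The names `splice`, `FIFOBuffer`, and `ReservoirBuffer` are hypothetical, and dropping a trailing chunk shorter than `chunk_len` is an assumption made here for simplicity. The reservoir variant shown is the classic "Algorithm R".

```python
import random
from collections import deque


def splice(episode, chunk_len):
    """Cut one long episode (a list of steps) into fixed-length chunks.

    Assumption of this sketch: a trailing remainder shorter than
    chunk_len is dropped.
    """
    return [episode[i:i + chunk_len]
            for i in range(0, len(episode) - chunk_len + 1, chunk_len)]


class FIFOBuffer:
    """Keeps only the most recent `capacity` items."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest item when full.
        self.items = deque(maxlen=capacity)

    def add(self, item):
        self.items.append(item)


class ReservoirBuffer:
    """Keeps a uniform random sample of everything seen so far (Algorithm R)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of items offered to the buffer
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not yet full: always keep the item.
            self.items.append(item)
        else:
            # Draw a slot uniformly from [0, seen); the new item replaces
            # a stored one only if the slot falls inside the buffer, which
            # happens with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item


if __name__ == "__main__":
    episode = list(range(10))
    chunks = splice(episode, 4)  # two chunks of length 4; steps 8 and 9 are dropped

    fifo = FIFOBuffer(capacity=3)
    reservoir = ReservoirBuffer(capacity=3, seed=0)
    for step in range(100):
        fifo.add(step)
        reservoir.add(step)

    print(chunks)
    print(list(fifo.items))  # the three most recent steps
    print(reservoir.items)   # three steps drawn from the whole stream
```

The design difference is what each buffer remembers once it is full: the FIFO buffer holds only recent data, so experience from early tasks eventually disappears, while the reservoir buffer retains every past item with equal probability using the same fixed memory.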