MDP: Markov Decision Process, a mathematical framework for modeling decision-making where outcomes are partly random and partly under the control of a decision maker.
Offline RL: Reinforcement learning that learns from a static dataset of previously collected experiences without interacting with the environment.
Decision Transformer: An architecture that formulates reinforcement learning as a sequence modeling problem, predicting actions autoregressively like words in a sentence.
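As a toy sketch of that sequence formulation: Decision Transformer interleaves each trajectory into (return-to-go, state, action) triples and predicts each action token from everything before it. The helper names below (`returns_to_go`, `to_sequence`) are illustrative, and states/actions are toy scalars standing in for embedded observations.

```python
def returns_to_go(rewards):
    """Suffix sums: R_t = sum of rewards from step t onward."""
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return list(reversed(rtg))

def to_sequence(states, actions, rewards):
    """Interleave a trajectory into (return-to-go, state, action) order."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq.extend([("rtg", g), ("state", s), ("action", a)])
    return seq

seq = to_sequence(states=[0, 1], actions=[1, 0], rewards=[0.0, 1.0])
# The model is trained to predict each ("action", ...) token
# autoregressively from the tokens that precede it.
```

Conditioning on the return-to-go (rather than the per-step reward) is what lets the model be prompted at test time with a desired total return.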
Tokenization: The process of converting continuous trajectory data (like robot joint angles or game images) into discrete tokens that a Transformer can process.
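One common discretization scheme is uniform binning: clip each continuous value (e.g. a joint angle) to a fixed range and map it to one of N equal-width bins. The range, bin count, and function names below are illustrative assumptions, not a specific model's scheme.

```python
import numpy as np

def tokenize(values, low=-1.0, high=1.0, num_bins=256):
    """Map continuous values to discrete token ids via uniform binning."""
    values = np.clip(np.asarray(values, dtype=np.float64), low, high)
    frac = (values - low) / (high - low)          # scale to [0, 1]
    return np.minimum((frac * num_bins).astype(int), num_bins - 1)

def detokenize(tokens, low=-1.0, high=1.0, num_bins=256):
    """Invert tokenization by returning each bin's center value."""
    return low + (np.asarray(tokens) + 0.5) * (high - low) / num_bins

angles = [0.0, -1.0, 0.73]
tokens = tokenize(angles)        # integer ids in [0, 255]
recovered = detokenize(tokens)   # within half a bin width of the input
```

The round trip loses at most half a bin width of precision, which is the usual trade-off between vocabulary size and reconstruction accuracy.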
Proprioceptive states: Internal states of a robot, such as joint angles and velocities, as opposed to external visual observations.
Modality encoding: Adding learned embeddings to tokens to help the model distinguish between different data types (e.g., distinguishing a state token from an action token).
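A minimal sketch of modality encoding, assuming one learned vector per modality added to every token of that type (the random initialization below stands in for parameters a real model would train):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8  # toy embedding width

# One learned embedding per modality (randomly initialized here).
modality_embed = {
    "state": rng.normal(size=d_model),
    "action": rng.normal(size=d_model),
    "rtg": rng.normal(size=d_model),
}

def encode(token_vec, modality):
    """Add the modality embedding so the Transformer can tell a state
    token apart from an action token at the same position."""
    return token_vec + modality_embed[modality]

tok = rng.normal(size=d_model)
state_tok = encode(tok, "state")
action_tok = encode(tok, "action")
# Same underlying token, now distinguishable by modality.
```

This mirrors how positional embeddings work: the addition shifts each token into a modality-specific region of the embedding space without changing the sequence layout.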
Zero-shot adaptation: Applying a pretrained model to a new task without any additional gradient updates or training on that specific task.