ECoT: Embodied Chain-of-Thought, a method in which a robot policy generates intermediate textual reasoning (plans, subgoals) before predicting an action
VLA: Vision-Language-Action models, foundation models that map visual observations and language instructions directly to robot control actions
temporal locality: The property that high-level reasoning (like the overall task plan) rarely changes between consecutive control steps
continuous batching: A scheduling technique that dynamically inserts new requests into a running batch as soon as others finish, maximizing GPU utilization
vLLM: A high-throughput library for LLM inference and serving that implements PagedAttention and continuous batching
OpenVLA: An open-source VLA model built on the Prismatic vision-language model with a Llama 2 language backbone, used here as the base policy
LIBERO: A benchmark suite for lifelong robot learning evaluation, testing generalization across spatial, object, and goal variations
Action Faithfulness: A metric measuring how much the final action depends on specific reasoning steps, calculated as the L1 distance between the final action and an action predicted early in the chain
LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique that freezes the base model and trains small rank-decomposition matrices
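
The L1 distance in the Action Faithfulness entry above can be illustrated with a minimal sketch. It assumes each action is a flat vector of floats and that the distance is the plain sum of absolute differences; the glossary does not say whether the metric is summed or averaged over dimensions, so treat that choice, the function name l1_distance, and the 7-dimensional example values as hypothetical.

```python
from typing import Sequence


def l1_distance(final_action: Sequence[float], early_action: Sequence[float]) -> float:
    """Sum of absolute per-dimension differences between two action vectors.

    Assumption: both actions are flat vectors of equal length. Whether the
    original metric sums or averages over dimensions is not stated here.
    """
    if len(final_action) != len(early_action):
        raise ValueError("actions must have the same dimensionality")
    return sum(abs(f - e) for f, e in zip(final_action, early_action))


# Hypothetical 7-dimensional actions (e.g. position delta, rotation delta, gripper).
final_action = [0.5, 0.0, -0.25, 0.0, 0.0, 0.0, 1.0]
early_action = [0.25, 0.0, -0.25, 0.0, 0.0, 0.0, 0.0]

score = l1_distance(final_action, early_action)
print(score)
```

Under this reading, a larger distance means the action changed more between the early prediction and the end of the chain, and identical actions give a distance of zero.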