VLA: Vision-Language-Action; models that map visual observations and language instructions directly to robot actions
RL-VLA3: Triple-level asynchronous reinforcement learning architecture proposed in this paper
RDMA: Remote Direct Memory Access; a high-speed networking technology that lets one machine read or write another machine's memory directly, without involving the remote CPU in the data path
Ray: An open-source unified framework for scaling AI and Python applications
FlashAttention: An IO-aware exact attention algorithm that speeds up training and reduces memory usage
Data Packing: Concatenating multiple short sequences into a single long sequence to remove padding tokens and improve GPU utilization
FP8: 8-bit floating point; a low-precision number format used to reduce model size and speed up computation
PTQ: Post-Training Quantization; quantizing a model after training is complete, without further fine-tuning
Rollout: The process of an agent interacting with an environment to generate training data (trajectories)
Actor: The component in RL responsible for updating the policy network based on collected data
LeRobot: Open-source embodied AI training framework by Hugging Face
Dynamic Batching: Grouping a variable number of incoming requests into one batch, closing the batch when either a size limit or a wait-time limit is reached, to optimize throughput
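
To make the Dynamic Batching entry concrete, here is a minimal offline sketch in Python. It is not the paper's implementation: the names Request, dynamic_batches, max_batch_size, and max_wait are hypothetical, and the function replays a pre-sorted list of timestamped requests rather than serving a live queue. A batch closes when it is full, or when the next request arrives more than max_wait seconds after the batch's first request.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    arrival: float  # arrival time in seconds (hypothetical field)
    payload: str    # stand-in for an observation or prompt


def dynamic_batches(
    requests: List[Request], max_batch_size: int, max_wait: float
) -> List[List[Request]]:
    """Group requests, assumed sorted by arrival time, into batches.

    The current batch is closed when it already holds max_batch_size
    requests, or when the incoming request arrived more than max_wait
    seconds after the first request in the batch.
    """
    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in requests:
        full = len(current) >= max_batch_size
        timed_out = bool(current) and req.arrival - current[0].arrival > max_wait
        if current and (full or timed_out):
            batches.append(current)
            current = []
        current.append(req)
    if current:  # flush whatever is left at the end of the trace
        batches.append(current)
    return batches


if __name__ == "__main__":
    arrivals = [0.00, 0.01, 0.02, 0.03, 0.20, 0.21]
    trace = [Request(t, f"r{i}") for i, t in enumerate(arrivals)]
    result = dynamic_batches(trace, max_batch_size=3, max_wait=0.05)
    print([len(b) for b in result])  # batch sizes: [3, 1, 2]
```

In the trace above, the first batch closes on the size limit, the second on the time limit, and the third is flushed at the end. A live server would typically replace the replay loop with a queue and a timer, but the two closing conditions stay the same.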