Semantic Orientation: A unit vector representing a specific, language-grounded direction of an object (e.g., 'handle direction') independent of a global reference frame
6-DoF: Six Degrees of Freedom, the independent motions available to a rigid body in three-dimensional space: three translations (x, y, z) and three rotations (roll, pitch, yaw)
PointSO: The authors' proposed cross-modal 3D Transformer model that predicts semantic orientation vectors from point clouds and text
VLM: Vision-Language Model, an AI model that processes both images and text to perform reasoning tasks
SimplerEnv: A simulation environment for evaluating robot manipulation policies
Open6DOR: A benchmark for evaluating 6-DoF object rearrangement tasks
OrienText300K: The authors' proposed large-scale dataset of 3D objects annotated with language-grounded orientation vectors
SAM: Segment Anything Model—a foundation model for image segmentation
CoT: Chain-of-Thought—a prompting technique where the model generates intermediate reasoning steps before the final answer
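
To make the "Semantic Orientation" entry concrete, the following is a minimal sketch of how a language-grounded direction can be stored as a unit vector and compared against a desired direction. It is an illustration only, not the authors' implementation: the helper names (normalize, angle_between) and the example values are hypothetical, and nothing here reflects PointSO's actual interface.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in v)

def angle_between(u, v):
    """Return the angle in degrees between two 3D directions."""
    u, v = normalize(u), normalize(v)
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))

# Hypothetical values: a predicted 'handle direction' and a target direction.
handle_direction = normalize((1.0, 1.0, 0.0))
target_direction = (1.0, 0.0, 0.0)

# Angular error between the predicted and the desired orientation.
error_deg = angle_between(handle_direction, target_direction)
print(f"orientation error: {error_deg:.1f} degrees")
```

Because the vector is attached to the object's own semantics rather than to a fixed world axis, the same comparison applies regardless of how the scene's coordinate frame is chosen, provided both directions are expressed in the same frame.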