ReCon: A contrastive learning framework for 3D representation that uses reconstruction as guidance
6-DoF: Six Degrees of Freedom, the freedom of movement of a rigid body in three-dimensional space (position x, y, z and orientation pitch, yaw, roll)
Hungarian algorithm: A combinatorial optimization algorithm that solves the assignment problem, used here to optimally match 3D queries to 2D view features
Point Cloud: A set of data points in space (usually 3D) representing the external surface of an object
Distillation: The process of transferring knowledge from a large teacher model (or a rich modality such as multi-view images) to a student model (here, the 3D encoder)
Objaverse: A massive dataset of 3D objects used for training
DETR: DEtection TRansformer, an object detection model that uses a bipartite matching loss and transformers
APE: Absolute Position Encoding, embeddings added to representations to retain spatial coordinate information
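
To make the assignment problem from the Hungarian algorithm entry concrete, here is a minimal sketch of one-to-one matching between queries and view features. It is not the Hungarian algorithm itself: it brute-forces every permutation, which returns the same optimal matching on tiny inputs but scales factorially. The function name best_assignment and the cost values are hypothetical and chosen only for illustration.

```python
from itertools import permutations

def best_assignment(cost):
    """Return (total_cost, pairs) for the minimum-cost one-to-one matching.

    cost[i][j] is an assumed cost of pairing query i with view feature j.
    Brute force over all permutations, so only suitable for tiny inputs.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[i] is the view feature assigned to query i.
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(enumerate(best_perm))

# Illustrative costs only (not taken from any dataset): 3 queries x 3 views.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, pairs = best_assignment(cost)
print(total, pairs)
```

Each query is paired with exactly one view feature and no view feature is reused, which is the constraint a bipartite matching loss relies on. The Hungarian algorithm finds the same optimum in polynomial time, so a real pipeline would use it (or a library implementation) rather than this enumeration.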