GSU: General Search Unit—the first stage of a recommendation system that retrieves a small candidate set from a massive item pool
ESU: Exact Search Unit—the second stage that precisely ranks the candidate set using complex models
Semantic IDs: Discrete codes (tokens) representing items, derived from quantizing dense embeddings, allowing LLM-like sequence modeling for recommendations
Res-Kmeans: Residual K-means—a quantization method that approximates vectors by recursively clustering residuals (errors) from the previous step
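FSQ: Finite Scalar Quantization, a method that projects vectors into a few bounded dimensions and rounds each dimension to a fixed set of levels, so the codebook is an implicit grid rather than a learned table; this encourages even code usage and avoids codebook collapse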
Swing: An item-to-item collaborative filtering algorithm that scores two items as similar when many pairs of users interacted with both (User-Item-User paths), down-weighting user pairs whose interaction histories overlap heavily
Exposure Bias: The tendency for recommendation systems to suggest popular items simply because they are shown more often, not because they are most relevant
NTP: Next Token Prediction—the standard training objective for generative language models
MoE: Mixture of Experts—a neural network architecture that activates only a subset of sub-networks (experts) for each input to save compute
SIDs: Semantic IDs—see Semantic IDs definition above
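
As a rough illustration of how the Res-Kmeans and FSQ entries above turn a dense embedding into Semantic ID tokens, here is a minimal pure-Python sketch. It is not taken from any particular system: the function names (`kmeans`, `fit_res_kmeans`, `encode`, `fsq_codes`, `fsq_index`), the random toy embeddings, and the sizes (4 centroids, 3 levels, 5 grid levels per dimension) are all assumptions chosen for readability. The k-means step is plain Lloyd's algorithm, and the FSQ part handles odd level counts only and leaves out the learned projection and the straight-through gradient that a trained model would need.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means, initialised from k randomly sampled points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def fit_res_kmeans(embeddings, k, num_levels):
    """Fit one codebook per level; each level clusters the previous residuals."""
    codebooks, residuals = [], [list(e) for e in embeddings]
    for level in range(num_levels):
        centroids = kmeans(residuals, k, seed=level)
        codebooks.append(centroids)
        residuals = [
            [x - c for x, c in zip(r, centroids[nearest(r, centroids)])]
            for r in residuals
        ]
    return codebooks

def encode(embedding, codebooks):
    """Return (semantic ID tuple, leftover residual) for one embedding."""
    codes, residual = [], list(embedding)
    for centroids in codebooks:
        c = nearest(residual, centroids)
        codes.append(c)
        residual = [x - y for x, y in zip(residual, centroids[c])]
    return tuple(codes), residual

def fsq_codes(z, levels):
    """Bound each dimension with tanh, then round to one of levels[i] grid points."""
    codes = []
    for value, num_levels in zip(z, levels):
        half = (num_levels - 1) / 2  # assumes an odd number of levels
        codes.append(int(round(math.tanh(value) * half) + half))
    return codes

def fsq_index(codes, levels):
    """Pack per-dimension codes into a single token id (mixed-radix number)."""
    index, base = 0, 1
    for code, num_levels in zip(codes, levels):
        index += code * base
        base *= num_levels
    return index

# Toy data: 64 hypothetical item embeddings of dimension 4.
rng = random.Random(42)
items = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(64)]
codebooks = fit_res_kmeans(items, k=4, num_levels=3)
sid, leftover = encode(items[0], codebooks)
print("semantic ID:", sid)  # a tuple of 3 codes, each in 0..3
print("FSQ codes:", fsq_codes([0.0, 10.0, -10.0], [5, 5, 5]))  # [2, 4, 0]
```

In this sketch each Res-Kmeans level contributes one token, so an item becomes a short coarse-to-fine code sequence, while FSQ needs no fitting at all because its grid is fixed in advance.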