Generative Retrieval: A recommendation paradigm where the model directly generates identifiers (tokens) of the target items instead of selecting them from a candidate list
Vector Quantization: The process of mapping continuous embedding vectors to a finite set of discrete codes (cluster centroids)
Playlist Co-occurrence: A signal indicating which songs frequently appear together in user-created playlists, capturing cultural and contextual similarity
LLM: Large Language Model, a type of AI model trained on vast text data to understand and generate human language
Modalities: Different types of data representing music: Audio (sound), Lyrics (text), Metadata (facts), Semantic tags (mood/genre), and Playlist patterns
MusicFM: A pre-trained foundation model used here to extract audio feature embeddings
NV-Embed-v2: A state-of-the-art text embedding model used here for lyrics, metadata, and semantic tags
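
To make the Vector Quantization entry concrete, the following is a minimal sketch of nearest-centroid assignment with NumPy. The `quantize` helper, the toy embeddings, and the fixed two-entry codebook are all hypothetical and chosen only for illustration. In practice the codebook would be learned from data, and the source does not say which quantization method is used.

```python
import numpy as np

def quantize(embeddings, codebook):
    """Map each embedding to the index of its nearest codebook centroid."""
    # Pairwise squared Euclidean distances, shape (n_embeddings, n_centroids).
    diffs = embeddings[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # The discrete code is the index of the closest centroid.
    return np.argmin(dists, axis=1)

# Hypothetical toy data: three 2-D "song embeddings" and a two-entry codebook.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
embeddings = np.array([[0.1, -0.1], [0.9, 1.2], [0.4, 0.3]])

codes = quantize(embeddings, codebook)
print(codes)  # one integer code per embedding
```

Each continuous vector is replaced by a single integer, which is what lets a generative retrieval model treat item identifiers as tokens it can emit.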