Recommendation with Generative Models

Yashar Deldjoo, Zhankui He, Julian McAuley, Anton Korikov, Scott Sanner, Arnau Ramisa, Rene Vidal, Maheswaran Sathiamoorthy, Atoosa Kasrizadeh, Silvia Milano, Francesco Ricci
Polytechnic University of Bari, Italy; Google DeepMind, USA; University of California, San Diego, USA
arXiv (2024)
Recommendation RAG MM P13N Benchmark

📝 Paper Summary

Generative AI in Recommender Systems; Foundations of Recommender Systems
The authors propose a paradigm shift from discriminative filtering to Generative Recommender Systems (Gen-RecSys), utilizing models that learn data distributions to generate structured, textual, and multimodal outputs rather than merely ranking existing items.
Core Problem
Traditional discriminative recommender systems model $P(Y|X)$ in order to rank items from a fixed catalog. This limits their ability to handle cold-start scenarios, generate explanations, or produce complex structured outputs such as bundles and creative content.
Why it matters:
  • Standard systems struggle with data sparsity and cold-start problems where interaction history is minimal.
  • Discriminative models prioritize accuracy over transparency, failing to provide natural language explanations or reasoning for recommendations.
  • Non-generative systems cannot create new content (e.g., personalized fashion designs or text) or support complex, multi-turn conversational interactions.
Concrete Example: A traditional system can predict a movie rating, but it cannot generate a personalized review explaining *why* a user would like the movie. In contrast, a generative system (like the example in Figure 2.1) can synthesize a unique cocktail recipe ('Pomberrytini') or write a beer review ('Pours a very dark brown...') based on learned user preferences.
Key Novelty
Generative Recommender Systems (Gen-RecSys) Framework
  • Reframes recommendation from a discriminative task (predicting a label given an item) to a generative task (estimating the probability distribution of items or data given a user or label).
  • Classifies systems by output capability: Structured Outputs (bundles/sequences), Text Generation (explanations/dialogue), and Multimedia Generation (images/audio), using models such as VAEs and LLMs.
  • Distinguishes between 'Directly trained models' (trained from scratch on interaction data) and 'Pretrained Generative Models' (foundation models such as GPT-4 or CLIP, adapted via fine-tuning or prompting).
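The generative framing in the first bullet can be illustrated with a toy sketch. It is not the authors' method: a smoothed count table stands in for the VAEs and LLMs named above, and the interaction log, the segment names, and the functions `fit_item_distribution`, `rank_items`, and `sample_items` are all hypothetical. The point is only that once a distribution over items given a user segment has been estimated, it can be used both to rank a fixed catalog and to sample new outputs.

```python
import random
from collections import Counter

# Hypothetical toy interaction log of (user_segment, item) pairs.
INTERACTIONS = [
    ("sci-fi fan", "Dune"), ("sci-fi fan", "Arrival"), ("sci-fi fan", "Dune"),
    ("comedy fan", "Airplane!"), ("comedy fan", "Superbad"),
    ("sci-fi fan", "Superbad"),
]
CATALOG = sorted({item for _, item in INTERACTIONS})


def fit_item_distribution(interactions, catalog, alpha=1.0):
    """Estimate P(item | segment) from counts with add-alpha smoothing."""
    counts = {}
    for segment, item in interactions:
        counts.setdefault(segment, Counter())[item] += 1
    dist = {}
    for segment, c in counts.items():
        total = sum(c.values()) + alpha * len(catalog)
        dist[segment] = {item: (c[item] + alpha) / total for item in catalog}
    return dist


def rank_items(dist, segment):
    """Ranking use: order the fixed catalog by estimated probability."""
    probs = dist[segment]
    return sorted(probs, key=probs.get, reverse=True)


def sample_items(dist, segment, k, rng):
    """Generative use: draw k items from the estimated distribution."""
    items = list(dist[segment])
    weights = [dist[segment][item] for item in items]
    return rng.choices(items, weights=weights, k=k)


if __name__ == "__main__":
    dist = fit_item_distribution(INTERACTIONS, CATALOG)
    print(rank_items(dist, "sci-fi fan"))
    print(sample_items(dist, "sci-fi fan", k=3, rng=random.Random(0)))
```

With this log, `rank_items` places "Dune" first for the "sci-fi fan" segment, while `sample_items` returns a random draw whose contents depend on the seed. The smoothing parameter `alpha` gives unseen items a nonzero probability, which is a crude stand-in for how a learned distribution can still produce output when interaction data is sparse.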
Breakthrough Assessment
8/10
This work establishes a comprehensive taxonomy and foundational theory for the emerging field of Gen-RecSys, unifying diverse approaches (VAEs, LLMs, Diffusion) under a single framework, though it is a survey/monograph rather than a single empirical study.