
Preference Discerning with LLM-Enhanced Generative Retrieval

F Paischer, L Yang, L Liu, S Shao, K Hassani, J Li…
Institute for Machine Learning, JKU Linz, AI at Meta, University of Wisconsin-Madison
arXiv, 12/2024
Recommendation P13N MM Benchmark

📝 Paper Summary

Sequential Recommendation, Generative Retrieval, LLM-based Recommendation
The paper introduces 'preference discerning,' a paradigm where generative recommendation models are explicitly conditioned on natural language user preferences to dynamically steer recommendations without retraining.
Core Problem
Current sequential recommendation models rely solely on past interaction history, making them unable to dynamically adapt to changing user interests (e.g., new hobbies) without retraining.
Why it matters:
  • Static models reinforce echo chambers by continuing to recommend similar content even when user intent shifts
  • Adapting to life changes (e.g., a career transition) currently requires slow model retraining instead of an immediate response
  • Existing methods lack a mechanism for users to explicitly steer recommendations using natural language instructions
Concrete Example: A user who historically watched entertainment videos starts learning a new skill (e.g., coding). Current models continue recommending viral entertainment videos instead of tutorials because they only see the long history of entertainment, ignoring the recent shift in intent.
Key Novelty
Preference Discerning (Mender)
  • Decouples preference extraction from recommendation: uses an LLM to distill history into concise text preferences, then conditions the recommender on these text preferences
  • Fuses semantic item IDs (collaborative filtering) with pre-trained language encoders (semantic understanding) to allow direct steering via natural language
  • Introduces a 5-axis benchmark to evaluate steerability, including sentiment following and fine-grained control
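The decoupling described in the first bullet can be sketched as a two-stage pipeline. This is a minimal illustration under stated assumptions, not the paper's implementation: `RecommenderInput`, `extract_preferences`, `build_input`, and `stub_llm` are hypothetical names, the prompt wording is invented, and the LLM is replaced by a stub that returns fixed text. The point it shows is that the preference stream is ordinary text, so it can be swapped at inference time without touching model weights.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class RecommenderInput:
    """Hypothetical container for the two streams given to the recommender."""
    item_ids: List[str]      # interaction history as item identifiers
    preferences: List[str]   # natural-language preference statements


def extract_preferences(reviews: Sequence[str], llm: Callable[[str], str]) -> List[str]:
    """Stage 1: ask an LLM to distill history text into short preference lines."""
    prompt = "List this user's preferences, one per line:\n" + "\n".join(reviews)
    raw_lines = llm(prompt).splitlines()
    # Drop empty lines and leading bullet characters from the LLM output.
    return [line.lstrip("-• ").strip() for line in raw_lines if line.strip()]


def build_input(
    item_ids: Sequence[str],
    distilled: Sequence[str],
    user_stated: Optional[Sequence[str]] = None,
) -> RecommenderInput:
    """Stage 2: pair the history with preferences; user-stated text overrides."""
    preferences = list(user_stated) if user_stated else list(distilled)
    return RecommenderInput(item_ids=list(item_ids), preferences=preferences)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: returns one preference per line)."""
    return "- prefers lightweight running shoes\n- dislikes strong fragrances\n"


history = ["item_17", "item_42"]
reviews = ["Great shoes, very light.", "Perfume was overpowering."]

distilled = extract_preferences(reviews, stub_llm)
default_input = build_input(history, distilled)
# Steering: same history, new preference text, no retraining involved.
steered_input = build_input(history, distilled, user_stated=["wants beginner yoga gear"])
```

In this sketch the recommender itself is left out on purpose; only the interface between the two stages is shown.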
Architecture
Figure 2: The architecture of Mender (Multimodal Preference Discerner), showing its two-stream input: interaction history and natural language preferences.
Evaluation Highlights
  • Mender outperforms TIGER (state-of-the-art generative retrieval) by ~20-30% on preference-based recommendation tasks across Sports and Beauty datasets
  • Achieves superior sentiment following: effectively avoids items associated with negative preferences while retrieving positive ones, unlike standard sequential baselines
  • Demonstrates zero-shot steerability: can be guided by preferences not seen during training to recommend semantically related items
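One plausible way to score the sentiment-following behavior described above is a combined hit check: an item counts as a success only if it appears in the top-k list produced under a positive preference and is absent from the top-k list produced under the matching negative preference. The sketch below assumes that definition; it may differ from the exact metric used in the paper, and the function name and toy rankings are illustrative only.

```python
from typing import Sequence


def sentiment_following_at_k(
    pos_ranked: Sequence[str],
    neg_ranked: Sequence[str],
    item: str,
    k: int,
) -> float:
    """Return 1.0 if `item` is retrieved under the positive preference
    and NOT retrieved under the negative preference, else 0.0."""
    hit_under_positive = item in pos_ranked[:k]
    hit_under_negative = item in neg_ranked[:k]
    return 1.0 if hit_under_positive and not hit_under_negative else 0.0


# Toy rankings (made up for illustration, not results from the paper).
pos_ranked = ["a", "b", "c"]   # ranking when the preference is phrased positively
neg_ranked = ["d", "a", "e"]   # ranking when the preference is phrased negatively

score_b = sentiment_following_at_k(pos_ranked, neg_ranked, "b", k=2)  # retrieved only under positive
score_a = sentiment_following_at_k(pos_ranked, neg_ranked, "a", k=2)  # retrieved under both
score_c = sentiment_following_at_k(pos_ranked, neg_ranked, "c", k=2)  # outside the top-2 under positive
```

Averaging this score over all evaluated preference and item pairs would give a single number per model, which is how a comparison against a baseline that ignores sentiment could be made.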
Breakthrough Assessment
7/10
Strong conceptual contribution in making recommenders steerable via language. The benchmark is comprehensive. Performance gains are significant, though the reliance on generated preferences adds pipeline complexity.