Leveraging Large Language Models in Conversational Recommender Systems

Luke Friedman, Sameer Ahuja, David Allen, Zhenning Tan, Hakim Sidahmed, Changbo Long, Jun Xie, Gabriel Schubiner, Ajay Patel, Harsh Lara, Brian Chu, Zexi Chen, Manoj Tiwari
Google Research
arXiv (2023)
Recommendation Agent Memory RL

📝 Paper Summary

Conversational Recommender Systems (CRS), Dialogue Management, LLM-based User Simulation
RecLLM is a roadmap and architecture for building a large-scale conversational recommender system that uses LLMs for unified dialogue management, tractable retrieval, and joint ranking and explanation, demonstrated as a proof of concept on the YouTube corpus.
Core Problem
Traditional recommender systems rely on implicit signals (clicks) and offer little transparency or user control, while existing conversational recommenders often struggle with large-scale corpora, poor groundedness, and scarce training data.
Why it matters:
  • Current large-scale recommenders often surface clickbait or biased content because they rely on implicit signals rather than explicit user intent
  • LLMs hallucinate and struggle to interface efficiently with industrial-scale item corpora (millions of items) without massive memorization
  • The lack of production CRS products creates a 'cold start' data problem: there are no conversation logs on which to train sophisticated models
Concrete Example: A user asks for 'videos about fish recipes'. A standard chatbot might hallucinate non-existent video titles, and a direct-prediction LLM cannot memorize millions of changing YouTube videos. RecLLM instead has the LLM issue a search query (an API call) to a retrieval engine and then explain the returned results.
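The retrieve-then-explain flow in the example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `generate_query` is a hypothetical stand-in for the LLM call, `search` is a toy keyword matcher standing in for the retrieval engine, and the corpus titles are made up. The point it shows is that every title in the reply comes from the corpus, so none can be hallucinated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    title: str


# Tiny in-memory stand-in for the item corpus (titles are invented).
CORPUS = [
    Video("v1", "Easy baked fish recipes"),
    Video("v2", "Grilled fish tacos"),
    Video("v3", "Beginner guitar chords"),
]

# Filler words dropped by the toy query generator (an assumption).
STOPWORDS = {"videos", "about", "for", "me", "show", "some"}


def generate_query(user_utterance: str) -> str:
    """Hypothetical stand-in for the LLM that writes the search query."""
    words = [w.strip(".,!?'").lower() for w in user_utterance.split()]
    return " ".join(w for w in words if w and w not in STOPWORDS)


def search(query: str, corpus: list[Video], k: int = 2) -> list[Video]:
    """Toy retrieval engine: rank items by word overlap with the query."""
    terms = set(query.split())
    scored = []
    for video in corpus:
        overlap = len(terms & set(video.title.lower().split()))
        if overlap > 0:
            scored.append((overlap, video))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video for _, video in scored[:k]]


def respond(user_utterance: str) -> str:
    """Build a reply that only mentions items returned by retrieval."""
    query = generate_query(user_utterance)
    results = search(query, CORPUS)
    if not results:
        return "I could not find matching videos."
    titles = "; ".join(video.title for video in results)
    return f"For '{query}' I found: {titles}"
```

Calling `respond("videos about fish recipes")` issues the query `fish recipes` and lists only titles present in `CORPUS`. In a real system the query writer and the explanation step would both be LLM calls, and the retrieval engine would index the full corpus.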
Key Novelty
RecLLM (End-to-End LLM-based CRS Architecture)
  • Replaces modular dialogue state tracking with a unified LLM that outputs both natural language and API calls (e.g., 'Request: <query>') as a single language modeling task
  • Proposes a joint ranking/explanation module where an LLM scores items based on metadata summaries and context, generating natural language justifications via chain-of-thought
  • Introduces a controllable user simulator conditioned on session-level variables (profiles) or turn-level variables (intents) to generate synthetic training data for the CRS
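The first point above, a single model output that mixes natural language with API calls, implies that downstream code must separate the two. The sketch below assumes a line-oriented format in which any line starting with `Request:` is an API call and every other non-empty line is user-facing text. Only the `Request: <query>` pattern comes from the summary; the `Turn` class, `parse_turn` function, and the line-based layout are hypothetical.

```python
from dataclasses import dataclass, field

# Prefix marking an API call, following the 'Request: <query>' example.
REQUEST_PREFIX = "Request:"


@dataclass
class Turn:
    reply: str = ""
    requests: list[str] = field(default_factory=list)


def parse_turn(model_output: str) -> Turn:
    """Split one model output into user-facing text and API requests."""
    turn = Turn()
    reply_lines = []
    for line in model_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(REQUEST_PREFIX):
            # Everything after the prefix is treated as the search query.
            query = stripped[len(REQUEST_PREFIX):].strip()
            if query:
                turn.requests.append(query)
        elif stripped:
            reply_lines.append(stripped)
    turn.reply = " ".join(reply_lines)
    return turn
```

For example, `parse_turn("Sure, let me look.\nRequest: 80s rock music videos")` yields the reply `Sure, let me look.` and one request, `80s rock music videos`, which the system would forward to the retrieval engine.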
Architecture
Figure 1: Overview of the RecLLM system components and their interaction
Evaluation Highlights
  • Demonstrates qualitative fluency in maintaining context across topic shifts (e.g., switching from 90s hip hop to 80s rock) in mock conversations
  • Proof-of-concept implementation on the full public YouTube video corpus using LaMDA-based models
  • Proposes a Reinforcement Learning from Human Feedback (RLHF) strategy for tuning the dialogue manager using simulated sessions
Breakthrough Assessment
5/10
This is primarily a position paper and roadmap with a proof-of-concept implementation, not a rigorous empirical study. It proposes an architecture and demonstrates feasibility but lacks quantitative benchmarking against state-of-the-art baselines.