Language-Based User Profiles for Recommendation

Joyce Zhou, Yijia Dai, Thorsten Joachims
Cornell University
arXiv (2024)
Recommendation P13N Memory

📝 Paper Summary

Recommender Systems, Large Language Models (LLMs) for Recommendation
The paper proposes replacing high-dimensional latent vectors with human-readable natural language summaries as user profiles, generated and processed by Large Language Models (LLMs) for transparent recommendation.
Core Problem
Conventional recommendation methods such as matrix factorization represent users as high-dimensional vectors. These vectors are unintelligible to humans and difficult to edit (steer), and the methods themselves often perform poorly in cold-start settings.
Why it matters:
  • Opaque vector profiles force reliance on post-hoc explanation methods instead of offering intrinsic interpretability
  • Users cannot directly correct or update their profiles to change recommendations (lack of steerability)
  • Standard methods struggle to make accurate predictions when user interaction history is sparse (cold-start)
Concrete Example: A matrix factorization model might represent a user as a vector `[0.1, -0.5, ...]` which is meaningless to the user. In contrast, this system generates text like 'User enjoys sci-fi movies but dislikes horror,' which the user can read and potentially edit.
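
The contrast in this example can be made concrete with a small sketch. All numbers, names, and the edit shown here are hypothetical illustrations, not values from the paper: a matrix-factorization style score is just a dot product of latent vectors, while a text profile can be read and changed with an ordinary string edit.

```python
# Hypothetical illustration: the vectors and helper names are invented for this sketch.
user_vector = [0.1, -0.5, 0.8]   # latent user factors (opaque to the user)
item_vector = [0.3, 0.2, 0.9]    # latent item factors


def mf_score(user, item):
    """Matrix-factorization style score: dot product of two latent vectors."""
    return sum(u * v for u, v in zip(user, item))


text_profile = "User enjoys sci-fi movies but dislikes horror."


def edit_profile(profile, old, new):
    """Steer recommendations by editing the readable profile text directly."""
    return profile.replace(old, new)


score = mf_score(user_vector, item_vector)
edited = edit_profile(text_profile, "dislikes horror", "also enjoys horror")
print(round(score, 2))
print(edited)
```

Nothing in `user_vector` tells the user which coordinate to change to get more horror recommendations, whereas the text profile exposes that preference in plain words.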
Key Novelty
Language-Based Factorization Model (LFM)
  • Replaces latent vector embeddings with a compact natural language summary of the user's interests
  • Uses an Encoder LLM to synthesize rating history into a text profile
  • Uses a Decoder LLM to read the text profile and perform downstream tasks like rating prediction or pairwise preference
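
The encoder and decoder steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the `LLM` callable type, the prompt wording, the 1 to 5 rating scale, and the `fake_llm` stub are all hypothetical stand-ins so the example runs without any model.

```python
import re
from typing import Callable, List, Optional, Tuple

# Assumption: an "LLM" is any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def encode_profile(llm: LLM, history: List[Tuple[str, float]]) -> str:
    """Encoder step: turn a rating history into a short text profile."""
    lines = "\n".join(f"- {title}: {rating}/5" for title, rating in history)
    prompt = "Summarize this user's tastes in a short paragraph:\n" + lines
    return llm(prompt).strip()


def decode_rating(llm: LLM, profile: str, item: str) -> Optional[float]:
    """Decoder step: predict a rating from the text profile alone."""
    prompt = (
        f"Profile: {profile}\n"
        f"Item: {item}\n"
        "Predict the rating on the scale given to you. Answer with a number."
    )
    reply = llm(prompt)
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        return None  # the reply could not be parsed into a rating
    value = float(match.group())
    return value if 1.0 <= value <= 5.0 else None


def fake_llm(prompt: str) -> str:
    """Hypothetical stub standing in for a real model call."""
    if prompt.startswith("Summarize"):
        return "User enjoys sci-fi movies but dislikes horror."
    return "4"


history = [("Blade Runner", 5.0), ("The Ring", 1.0)]
profile = encode_profile(fake_llm, history)
prediction = decode_rating(fake_llm, profile, "Arrival")
print(profile)
print(prediction)
```

The key design point is that the decoder sees only `profile`, never `history`, so the text summary plays the role that the latent user vector plays in matrix factorization.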
Architecture
Figure 1: Illustration of the Language-Based Factorization Model (LFM) pipeline.
Evaluation Highlights
  • LFM performs competitively with direct LLM prediction (no summary) across rating, preference, and choice tasks, showing that compact text profiles capture necessary information
  • LFM outperforms standard Matrix Factorization (NMF) in cold-start settings (sparse user history)
  • LFM provides better reliability (parse success rate) than direct LLM prediction, particularly with Llama 2 13B
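
A reliability metric of this kind could be computed along the following lines. This is a sketch under assumptions: the summary does not give the paper's exact parsing rule, so the regular expression, the 1 to 5 rating range, and the function names here are hypothetical.

```python
import re
from typing import List, Optional


def parse_rating(reply: str) -> Optional[float]:
    """Return a rating in [1, 5] if one can be extracted from the reply, else None."""
    match = re.search(r"\b([1-5](?:\.\d+)?)\b", reply)
    if match is None:
        return None
    value = float(match.group(1))
    return value if value <= 5.0 else None


def parse_success_rate(replies: List[str]) -> float:
    """Fraction of model replies that yield a usable rating."""
    if not replies:
        return 0.0
    parsed = [parse_rating(reply) for reply in replies]
    return sum(p is not None for p in parsed) / len(replies)


replies = ["4", "Rating: 3.5", "I cannot answer that.", ""]
rate = parse_success_rate(replies)
print(rate)
```

A higher rate means fewer predictions are lost to malformed or off-task model output, independent of how accurate the parsed ratings are.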
Breakthrough Assessment
7/10
Offers a significant shift in representation learning (from vectors to text) with strong potential for interpretability and steerability, though its zero-shot performance currently trails fully trained methods that have access to background data.