
Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ming Chen, Ji-Rong Wen
Renmin University of China, University of California San Diego, WeChat, Tencent
arXiv (2023)
Recommendation P13N

📝 Paper Summary

LLM for Recommendation, Generative Recommendation, Sequential Recommendation
LC-Rec bridges the gap between language and recommendation by using vector-quantized item indices and multi-task alignment tuning to integrate collaborative semantics directly into the LLM's generative process.
Core Problem
There is a large semantic gap between the language semantics captured by LLMs and the collaborative semantics (item IDs) used in recommender systems.
Why it matters:
  • Existing LLM-based recommenders often rely on text-only inputs, which ignore collaborative signals (user behavior patterns) crucial for accuracy.
  • Simple fine-tuning on ID sequences treats item IDs as out-of-vocabulary (OOV) tokens with no semantic grounding, which limits the LLM's ability to generalize.
  • Many current LLM recommenders cannot handle full-ranking scenarios (generating items from the entire catalog) and rely on pre-filtering candidates.
Concrete Example: A user who buys 'Legend of Zelda' might next buy a specific 'Nintendo Switch Case'. An LLM relying only on text might suggest generic 'Zelda' merchandise, whereas collaborative signals from other users' behavior indicate that this purchase is typically followed by a hardware accessory. Text-only models miss this latent behavioral link.
Key Novelty
LC-Rec (Language and Collaborative semantics for Recommendation)
  • Uses a Residual-Quantized Variational AutoEncoder (RQ-VAE) to learn discrete, multi-level item indices from item text embeddings, so that items with similar content receive similar indices.
  • Introduces 'Uniform Semantic Mapping', based on the Sinkhorn-Knopp algorithm, to prevent index collisions (multiple items sharing the same index) while preserving semantic structure.
  • Fine-tunes the LLM with asymmetric alignment tasks (e.g., predicting item titles from index sequences, inferring user intent from indices) to deeply fuse language and collaborative knowledge.
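The indexing ideas above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the codebooks are tiny hand-picked lists rather than learned RQ-VAE codebooks, the names `residual_quantize` and `sinkhorn_plan` and the temperature `eps` are hypothetical, and re-assigning only the last index level of colliding items is an assumption about where the uniform mapping is applied.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def residual_quantize(vec, codebooks):
    """Pick the nearest code at each level, then pass the residual to the next level."""
    residual, indices = list(vec), []
    for codebook in codebooks:
        k = min(range(len(codebook)), key=lambda j: sq_dist(residual, codebook[j]))
        indices.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return indices, residual

def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Sinkhorn-Knopp: rescale exp(-cost/eps) so rows sum to 1/n and columns to 1/m."""
    n, m = len(cost), len(cost[0])
    plan = [[math.exp(-c / eps) for c in row] for row in cost]
    for _ in range(n_iters):
        for row in plan:  # row normalization
            s = sum(row)
            for j in range(m):
                row[j] *= (1.0 / n) / s
        for j in range(m):  # column normalization
            s = sum(plan[i][j] for i in range(n))
            for i in range(n):
                plan[i][j] *= (1.0 / m) / s
    return plan

# Toy 2-level codebooks (hand-picked for illustration, not learned).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.1, -0.1], [-0.1, 0.1]],
]
items = {"A": [1.1, 0.9], "B": [1.08, 0.93]}

# Greedy nearest-code indexing: both items receive the same index (a collision).
greedy = {name: residual_quantize(vec, codebooks)[0] for name, vec in items.items()}

# Re-assign the last level using a transport plan with uniform marginals.
names = list(items)
last_residuals = [residual_quantize(items[n], codebooks[:-1])[1] for n in names]
cost = [[sq_dist(r, code) for code in codebooks[-1]] for r in last_residuals]
plan = sinkhorn_plan(cost, eps=0.01)
balanced = {
    n: greedy[n][:-1] + [max(range(len(row)), key=row.__getitem__)]
    for n, row in zip(names, plan)
}
print(greedy)
print(balanced)
```

In this toy case the greedy indices collide, and the balanced plan moves one item to the other last-level code. Taking the row-wise argmax of the plan is a simplification that does not guarantee unique indices in general.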
Architecture
Figure 1: The LC-Rec framework, illustrating the two-stage process: item indexing via RQ-VAE, followed by alignment tuning via multi-task instructions.
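As a rough illustration of how item indices might enter the alignment-tuning stage, the sketch below renders multi-level indices as special tokens and wraps them in instruction/response pairs. The token format (`<a_12><b_3>...`), the prompt wording, the helper names, and the sample title are all assumptions made for illustration; none of them is taken from this summary.

```python
LEVEL_PREFIXES = "abcd"  # hypothetical: one letter per index level

def index_to_tokens(indices):
    """Render a multi-level item index as special tokens, e.g. [12, 3] -> '<a_12><b_3>'."""
    return "".join(f"<{LEVEL_PREFIXES[level]}_{code}>" for level, code in enumerate(indices))

def sequential_example(history, target):
    """Sequential item prediction: index history in, next item's index out."""
    shown = ", ".join(index_to_tokens(idx) for idx in history)
    return {
        "instruction": f"The user interacted with these items in order: {shown}. Predict the next item.",
        "response": index_to_tokens(target),
    }

def title_example(indices, title):
    """Index-to-language alignment: item index in, item title out."""
    return {
        "instruction": f"What is the title of item {index_to_tokens(indices)}?",
        "response": title,
    }

# Made-up indices and a placeholder title, for illustration only.
history = [[12, 3, 45, 7], [12, 8, 1, 30]]
samples = [
    sequential_example(history, [40, 2, 9, 11]),
    title_example([12, 3, 45, 7], "Example Game Title"),
]
for sample in samples:
    print(sample)
```

Mixing several such task formats in one training set is what lets the same index tokens appear on both the input and the output side of the model.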
Evaluation Highlights
  • Improves HR@1 on the Games dataset by 68.6% over the best baseline (P5-CID), a gain attributed to integrating collaborative signals.
  • Achieves an average improvement of 25.5% over baselines in full-ranking evaluation across three Amazon datasets.
  • Outperforms text-based LLM methods (TALLRec, InstructRec) and ID-based generative methods (TIGER, P5) consistently on HR and NDCG metrics.
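For readers unfamiliar with the metrics, the sketch below computes HR@K and NDCG@K in the common setting where each user has a single held-out target item. That setting, the function names, and the toy rankings are assumptions for illustration; the summary does not spell out the exact evaluation protocol.

```python
import math

def hit_ratio_at_k(ranked, target, k):
    """1.0 if the target appears in the top-k of the ranked list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k):
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1), rank 1-based."""
    top_k = ranked[:k]
    if target not in top_k:
        return 0.0
    return 1.0 / math.log2(top_k.index(target) + 2)

def mean_metric(metric, rankings, targets, k):
    """Average a per-user metric over all users."""
    return sum(metric(r, t, k) for r, t in zip(rankings, targets)) / len(targets)

# Toy data: user 1's target is ranked second, user 2's target is ranked first.
rankings = [["i3", "i7", "i1"], ["i2", "i9", "i4"]]
targets = ["i7", "i2"]

hr1 = mean_metric(hit_ratio_at_k, rankings, targets, 1)
hr3 = mean_metric(hit_ratio_at_k, rankings, targets, 3)
ndcg3 = mean_metric(ndcg_at_k, rankings, targets, 3)
print(hr1, hr3, round(ndcg3, 4))
```

HR@K only checks whether the target is in the top K, while NDCG@K also rewards placing it closer to the top.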
Breakthrough Assessment
8/10
Significantly outperforms strong baselines by solving the ID collision problem in VQ-based recommendation and proposing a robust alignment strategy for LLMs. Effectively bridges the text-ID gap.