
Bridging the Information Gap Between Domain-Specific Model and General LLM for Personalized Recommendation

Wenxuan Zhang, Hongzhi Liu, Yingpeng Du, Chen Zhu, Yang Song, Hengshu Zhu, Zhonghai Wu
Peking University, BOSS Zhipin
arXiv (2023)
Recommendation, P13N, Memory

📝 Paper Summary

LLM-based Recommendation, Collaborative Filtering, Hybrid Recommendation Systems
BDLM aligns Large Language Models with domain-specific recommendation models via a shared embedding module and mutual learning, allowing the LLM to access community behavior patterns while enhancing the domain model with semantic knowledge.
Core Problem
General LLMs lack access to collaborative signals (community behavior patterns) critical for recommendation, while domain-specific models struggle with data sparsity due to a lack of general knowledge.
Why it matters:
  • LLMs struggle to distinguish similar items or capture latent community trends purely from text prompts
  • Domain-specific models (like Matrix Factorization) fail when interaction data is sparse because they cannot leverage semantic item content
  • Existing methods that translate interaction history into text prompts ('text-is-all-you-need') lose structural graph information
Concrete Example: In e-commerce, two shoes might have nearly identical text descriptions (e.g., 'Black Leather Shoes'), making them indistinguishable to an LLM. A domain model, however, can tell them apart because distinct user groups buy them. Conversely, a domain model fails on a new item with no clicks, whereas an LLM can infer its appeal from its description.
Key Novelty
Bridge Domain-specific and LLM models (BDLM)
  • Introduces task-specific tokens (<uid>, <iid>) into the LLM's vocabulary, initialized with embeddings from a domain-specific model (like LightGCN) to transfer behavioral patterns
  • Uses a deep mutual learning strategy where the LLM and domain model iteratively update a shared 'information sharing module' to align their representations in the same latent space
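The token-initialization idea in the first bullet can be illustrated with a toy sketch. This is not the authors' implementation: the per-entity token names (`<uid_7>`, `<iid_42>`), the helper names, the random linear projection, and all dimensions are assumptions made for illustration. The summary only states that `<uid>` and `<iid>` tokens are added to the vocabulary and initialized from domain-model embeddings.

```python
import random


def make_linear_projection(in_dim, out_dim, seed=0):
    """Hypothetical projection from the domain model's embedding size
    to the LLM's hidden size (assumed to differ; not stated in the summary)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def project(vec):
        if len(vec) != in_dim:
            raise ValueError(f"expected {in_dim} dims, got {len(vec)}")
        return [sum(w * x for w, x in zip(row, vec)) for row in weights]

    return project


def extend_embedding_table(vocab, table, domain_embs, prefix, project):
    """Append one new token per entity, initialized from its domain embedding.

    vocab:       dict mapping token string -> row index in `table`
    table:       list of embedding rows (lists of floats)
    domain_embs: dict mapping entity id -> domain-model embedding
    prefix:      "uid" or "iid"
    """
    for entity_id, emb in domain_embs.items():
        token = f"<{prefix}_{entity_id}>"  # hypothetical naming scheme
        if token in vocab:
            raise ValueError(f"token {token} already exists")
        vocab[token] = len(table)
        table.append(project(emb))
    return vocab, table


# Toy text vocabulary with a hidden size of 4 (illustrative values only).
vocab = {"black": 0, "leather": 1, "shoes": 2}
table = [[0.0] * 4 for _ in vocab]

# Stand-ins for embeddings exported from a domain model such as LightGCN.
user_embs = {7: [0.1, -0.3]}
item_embs = {42: [0.5, 0.2], 43: [-0.4, 0.9]}

project = make_linear_projection(in_dim=2, out_dim=4)
extend_embedding_table(vocab, table, user_embs, "uid", project)
extend_embedding_table(vocab, table, item_embs, "iid", project)
```

In this sketch, items 42 and 43 could share the text 'Black Leather Shoes' yet still receive different rows in the embedding table, which is the behavioral signal the text prompt alone cannot carry.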
Evaluation Highlights
  • +22.7% HR@1 improvement on MovieLens-1M compared to the state-of-the-art domain model (LightGCN)
  • +29.8% HR@1 improvement on Amazon-Grocery compared to LightGCN
  • Significantly outperforms text-only LLM baselines (InstructRec) across all datasets, particularly in e-commerce domains where text descriptions are less discriminative than movie titles
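For readers unfamiliar with the metric, HR@1 (hit rate at 1) is the fraction of test cases in which the held-out item is ranked first. A minimal sketch follows; the function names and toy data are hypothetical, and the relative-gain helper assumes the reported percentages are relative improvements over the baseline, which the summary does not specify.

```python
def hit_rate_at_k(ranked_lists, targets, k=1):
    """Fraction of test cases whose held-out target appears in the top-k.

    ranked_lists: one ranked candidate list per test case (best first)
    targets:      the held-out item for each test case
    """
    if len(ranked_lists) != len(targets):
        raise ValueError("ranked_lists and targets must have equal length")
    if not targets:
        return 0.0
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline` (assumed interpretation)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Toy rankings, not data from the paper.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["a", "c", "b"]]
targets = ["a", "a", "c", "b"]

hr_at_1 = hit_rate_at_k(ranked, targets, k=1)
hr_at_3 = hit_rate_at_k(ranked, targets, k=3)
```

Because HR@1 only rewards the top-ranked item, it is a strict measure: a model that places the correct item second scores the same as one that misses it entirely.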
Breakthrough Assessment
7/10
Strong conceptual contribution in bridging the 'modality gap' between ID-based and text-based recommendation. The mutual learning approach is effective, though the architecture relies on standard components.