
NoteLLM: A Retrievable Large Language Model for Note Recommendation

Chao Zhang, Shiwei Wu, Haoxin Zhang, Tong Xu, Yan Gao, Yao Hu, Di Wu, Enhong Chen
University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Xiaohongshu Inc.
arXiv (2024)
Recommendation, P13N

📝 Paper Summary

Item-to-Item (I2I) Recommendation, LLMs for Recommendation
NoteLLM jointly trains a Large Language Model to compress each note into an embedding for retrieval and to generate its hashtags/categories, improving item-to-item recommendation through multi-task learning.
Core Problem
Existing item-to-item (I2I) recommendation methods typically use BERT-based models that underutilize key conceptual cues like hashtags and categories, treating them merely as text rather than core semantic summaries.
Why it matters:
  • Hashtags and categories capture the central ideas of user-generated notes; these ideas are crucial for judging content relatedness but are often diluted in long text
  • Standard BERT-based embeddings fail to capture the generative connection between a note's content and its condensed summary (hashtag), missing a strong signal for relevance
  • LLMs have superior language understanding but are rarely used for I2I retrieval due to the challenge of adapting generative models to produce dense vector representations for millions of items
Concrete Example: A note about a trip might mention 'Marina Bay Sands' and 'Merlion Park'. A standard model might match this to general travel posts. However, the hashtag '#Singapore' explicitly defines the core topic. If the model cannot generate or predict this hashtag, it might miss the strong connection to other notes explicitly tagged '#Singapore'.
Key Novelty
Unified Generative-Contrastive Learning for Note Compression
  • Compresses an entire note into a single virtual token (via a prompt) that serves as the note's dense embedding for retrieval
  • Jointly trains the LLM on two tasks: (1) contrastive learning, which pulls the embeddings of co-occurring notes closer together, and (2) generative learning, which produces hashtags/categories from the compressed token
  • The generative task forces the compressed token to retain high-level semantic concepts (like topics), while the contrastive task injects collaborative user preference signals
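A minimal sketch of how the two objectives above could be combined, assuming an in-batch InfoNCE contrastive loss over cosine similarities and a mean token-level negative log-likelihood for the hashtag. The function names, the temperature `tau`, the mixing weight `alpha`, and the normalized weighting are illustrative assumptions rather than details taken from this summary; the toy vectors stand in for the compressed-token embeddings an LLM would produce.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchors, positives, tau=0.1):
    """In-batch InfoNCE: anchors[i] should match positives[i];
    every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / tau for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)

def generative_loss(token_probs):
    """Mean negative log-likelihood of the target hashtag tokens,
    given the probability the model assigned to each target token."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def joint_loss(anchors, positives, token_probs, alpha=0.5, tau=0.1):
    """Weighted combination of both objectives (weighting is an assumption)."""
    l_cl = contrastive_loss(anchors, positives, tau)
    l_gen = generative_loss(token_probs)
    return (l_cl + alpha * l_gen) / (1 + alpha)
```

With this setup, a batch whose co-occurring note pairs are correctly aligned yields a much lower contrastive term than the same batch with pairs shuffled, while the generative term reaches zero only when every hashtag token is predicted with probability 1.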
Architecture
Figure 2: The NoteLLM framework, detailing the prompt structure, the LLM processing, and the two training branches: Generative-Contrastive Learning (GCL) and Collaborative Supervised Fine-Tuning (CSFT).
Evaluation Highlights
  • +15.1% improvement in Recall@1 over the online baseline (BERT-based) in offline experiments on the Xiaohongshu dataset
  • +12.8% improvement in AUC in online A/B testing on the Xiaohongshu platform compared to the previous production system
  • Outperforms standard sentence embedding models (e.g., SimCSE, Sentence-BERT) by significant margins on precision and recall metrics
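For readers unfamiliar with the metric, Recall@k can be sketched as follows, assuming each query note comes with a set of ground-truth related notes and a ranked list of retrieved note IDs. The function names and data layout are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant notes that appear in the top-k retrieved notes."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    hits = sum(1 for note_id in relevant_ids if note_id in top_k)
    return hits / len(relevant_ids)

def mean_recall_at_k(queries, k):
    """Average Recall@k over a list of (ranked_ids, relevant_ids) pairs."""
    return sum(recall_at_k(ranked, rel, k) for ranked, rel in queries) / len(queries)
```

When every query has exactly one related note, Recall@1 reduces to the share of queries whose top-ranked result is that note.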
Breakthrough Assessment
7/10
Novel application of LLMs to item-to-item recommendation via token compression. Strong industrial results (Xiaohongshu) validate the approach, though the core idea combines existing contrastive and generative techniques.