
Keyword-driven Retrieval-Augmented Large Language Models for Cold-start User Recommendations

Hai-Dang Kieu, Minh Duc Nguyen, Thanh-Son Nguyen, Dung D. Le
VinUniversity, Institute of High Performance Computing, Agency for Science, Technology and Research
arXiv (2024)
RAG Recommendation P13N

📝 Paper Summary

Recommender Systems Cold-start Recommendation
KALM4Rec addresses cold-start recommendation by asking new users for keywords, retrieving candidates via a keyword-item graph, and re-ranking them using LLMs prompted with keyword profiles.
Core Problem
Traditional collaborative filtering fails for new (cold-start) users due to lack of interaction history, while LLMs struggle with token limits and hallucinations when processing full review text.
Why it matters:
  • Cold-start users are critical for platform growth but difficult to retain without personalized suggestions
  • Directly feeding user/item history into LLMs is token-expensive and prone to exceeding context windows
  • User reviews contain noise; extracting keywords captures preference essence more efficiently than full text
Concrete Example: A new user joins Yelp without history. Standard collaborative filtering (CF) cannot recommend anything personalized. Asking the user for keywords like 'sushi' and 'quiet' allows KALM4Rec to retrieve relevant spots, whereas a standard LLM prompted with 'recommend a restaurant' might hallucinate non-existent places.
Key Novelty
Keyword-driven Retrieval-Augmented Large Language Models for Cold-start User Recommendations (KALM4Rec)
  • Uses explicit keyword sets (e.g., 'sushi', 'quiet') instead of full reviews or dense vectors to represent user preferences and item characteristics, reducing token usage
  • Retrieves candidates by message passing on a graph that connects keywords to items, with no trainable deep-learning parameters
  • Re-ranks items using LLMs prompted with these concise keyword profiles and few-shot examples
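The retrieval idea can be illustrated with a minimal sketch. It assumes a simplified one-hop scheme in which each user keyword passes its edge weight to the items it connects to, and items are ranked by the summed score. The function names, edge weights, and toy restaurants are hypothetical, and the paper's actual message-passing formulation may differ.

```python
from collections import defaultdict


def build_keyword_item_graph(item_keywords):
    """Build keyword -> {item: weight} adjacency from item keyword profiles.

    item_keywords maps each item to {keyword: weight}; the weights here are
    illustrative (e.g. how often a keyword appears in the item's reviews).
    """
    graph = defaultdict(dict)
    for item, keywords in item_keywords.items():
        for keyword, weight in keywords.items():
            graph[keyword][item] = weight
    return graph


def retrieve_candidates(graph, user_keywords, top_k=20):
    """Score items by summing the weights passed from the user's keywords."""
    scores = defaultdict(float)
    for keyword in user_keywords:
        # Keywords absent from the graph contribute nothing.
        for item, weight in graph.get(keyword, {}).items():
            scores[item] += weight
    # Sort by descending score, breaking ties by name for determinism.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_k]]


# Hypothetical toy data, not taken from the paper.
ITEM_KEYWORDS = {
    "Sakura Sushi": {"sushi": 5, "quiet": 3, "fresh": 2},
    "Noisy Izakaya": {"sushi": 2, "lively": 4},
    "Corner Cafe": {"quiet": 4, "coffee": 5},
    "Burger Barn": {"burger": 5, "lively": 2},
}

graph = build_keyword_item_graph(ITEM_KEYWORDS)
candidates = retrieve_candidates(graph, ["sushi", "quiet"], top_k=3)
print(candidates)  # ['Sakura Sushi', 'Corner Cafe', 'Noisy Izakaya']
```

Nothing in this sketch is learned: the scores come directly from the graph's edge weights, which is the sense in which such a retriever needs no trained parameters.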
Architecture
Figure 1
The KALM4Rec framework showing the two-stage process: Candidate Retrieval via graph and Candidate Re-ranking via LLM.
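The second stage can be sketched as a prompt builder that shows the LLM only keyword profiles, optionally preceded by few-shot examples. This is a hedged illustration: the prompt wording, the `build_rerank_prompt` and `parse_ranking` helpers, and the sample data are all assumptions, not the paper's template. No LLM is called, and the response string is a made-up stand-in for model output.

```python
def format_candidates(candidates):
    """Render each candidate as one line: its name plus its keyword profile."""
    return "\n".join(f"- {name}: {', '.join(kws)}" for name, kws in candidates)


def build_rerank_prompt(user_keywords, candidates, examples=(), top_n=3):
    """Assemble a re-ranking prompt from keyword profiles and few-shot examples."""
    parts = [
        "Rank the candidate restaurants for the user. "
        "Choose only from the listed candidates."
    ]
    # Each few-shot example shows a solved instance of the same task.
    for ex in examples:
        parts.append(
            f"User keywords: {', '.join(ex['user_keywords'])}\n"
            f"Candidates:\n{format_candidates(ex['candidates'])}\n"
            f"Ranking: {', '.join(ex['ranking'])}"
        )
    # The final block is the actual query, left open for the model to complete.
    parts.append(
        f"User keywords: {', '.join(user_keywords)}\n"
        f"Candidates:\n{format_candidates(candidates)}\n"
        f"Return the top {top_n} as a comma-separated list.\n"
        "Ranking:"
    )
    return "\n\n".join(parts)


def parse_ranking(response, candidates, top_n=3):
    """Keep only names that appear in the candidate list, in response order."""
    allowed = {name for name, _ in candidates}
    picked = []
    for raw in response.split(","):
        name = raw.strip()
        if name in allowed and name not in picked:
            picked.append(name)
    return picked[:top_n]


# Hypothetical candidates and one few-shot example.
candidates = [
    ("Sakura Sushi", ["sushi", "quiet", "fresh"]),
    ("Corner Cafe", ["quiet", "coffee"]),
    ("Noisy Izakaya", ["sushi", "lively"]),
]
example = {
    "user_keywords": ["burger", "lively"],
    "candidates": [("Burger Barn", ["burger", "lively"]), ("Tea House", ["tea", "quiet"])],
    "ranking": ["Burger Barn", "Tea House"],
}

prompt = build_rerank_prompt(["sushi", "quiet"], candidates, examples=[example], top_n=2)
print(prompt)

# A made-up response containing one name that is not a candidate.
reranked = parse_ranking("Sakura Sushi, Moonlight Diner, Corner Cafe", candidates, top_n=2)
print(reranked)  # ['Sakura Sushi', 'Corner Cafe']
```

Filtering the response against the candidate list is one simple guard against hallucinated items: anything the retriever did not return is dropped before the ranking is used.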
Evaluation Highlights
  • Outperforms retrieval baselines (CLCRec, MVAE) on Yelp and TripAdvisor datasets in Recall@20 and Precision@20
  • LLM re-ranking (Gemini Pro 1.5) improves Recall@3 over retrieval-only methods, showing LLMs effectively refine keyword-based candidate lists
  • 3-shot prompting consistently outperforms zero-shot and 1-shot strategies for the re-ranking task
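For reference, the two metrics named above can be computed as follows. This sketch uses the standard definitions (hits in the top K divided by K for precision, and divided by the number of relevant items for recall). The summary does not describe the paper's exact evaluation protocol, and the item IDs are hypothetical.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Return (Precision@K, Recall@K) for one user's ranked list."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    # Guard against users with no held-out relevant items.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked recommendations and ground-truth visits.
ranked = ["A", "B", "C", "D", "E"]
relevant = {"B", "D", "Z"}

precision, recall = precision_recall_at_k(ranked, relevant, k=4)
print(precision)  # 0.5 (2 hits among the top 4)
print(round(recall, 3))  # 0.667 (2 of the 3 relevant items found)
```

In practice these per-user values are averaged over all test users to give a dataset-level score.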
Breakthrough Assessment
5/10
A practical application of LLMs to cold-start recommendation. The use of keywords to bridge user intent and LLM context is clever, but the graph method is relatively simple and the scale is moderate.