
Large Language Models are Zero-Shot Rankers for Recommender Systems

Yupeng Hou, Junjie Zhang, Zihan Lin, Hongyu Lu, Ruobing Xie, Julian McAuley, Wayne Xin Zhao
arXiv (2023)
Recommendation P13N Benchmark

Paper Summary

LLMs for Recommendation Zero-Shot Ranking
This paper shows that LLMs can act as effective zero-shot rankers when recommendation is formalized as a conditional ranking task, though they need specific prompting strategies to perceive the order of the interaction history and to counter position bias.
Core Problem
Traditional recommender systems are 'narrow experts' that lack common sense and struggle with complex user intents, while LLMs have potential but suffer from high computational costs and unknown behavioral biases in ranking tasks.
Why it matters:
  • Capturing user preferences solely from clicked ID sequences limits the expressive power for modeling explicit user interests
  • Existing transfer learning methods still require fine-tuning, making them less capable of solving diverse recommendation tasks in a zero-shot manner
  • Insufficient understanding of LLM characteristics (like order perception and biases) hinders their deployment in the ranking stage of recommendation pipelines
Concrete Example: When given a user's movie history, an LLM might fail to prioritize the most recent interests if the history is just listed sequentially, or it might incorrectly prefer a movie simply because it appears earlier in the candidate list (position bias).
Key Novelty
LLMs as Conditional Rankers with Bias Mitigation
  • Formalizes recommendation as a conditional ranking task where interaction history acts as the 'condition' and retrieved items are 'candidates' to be sorted
  • Finds that LLMs struggle to perceive the sequential order of the interaction history and proposes 'recency-focused' prompting to address this
  • Introduces a bootstrapping strategy (repeated ranking with shuffled candidates) to statistically alleviate the model's inherent position bias
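The conditional-ranking framing and the recency-focused prompt described above can be illustrated with a small prompt builder. This is a minimal sketch, not the authors' implementation: the function name `build_ranking_prompt`, the movie-domain wording, and the numbered candidate format are assumptions made for illustration. The interaction history is the condition, the retrieved items are the candidates, and one extra sentence names the most recent interaction explicitly.

```python
from typing import Sequence


def build_ranking_prompt(
    history: Sequence[str],
    candidates: Sequence[str],
    recency_focused: bool = True,
) -> str:
    """Build a zero-shot ranking prompt (hypothetical wording).

    history:    item titles in chronological order, oldest first (the condition).
    candidates: item titles retrieved by an upstream candidate generator.
    """
    if not history or not candidates:
        raise ValueError("history and candidates must both be non-empty")

    lines = ["I've watched the following movies in the past, in order:"]
    lines += [f"{i}. {title}" for i, title in enumerate(history, start=1)]

    if recency_focused:
        # Recency-focused prompting: state the latest interaction explicitly,
        # since the order of a plain sequential list may not be picked up.
        lines.append(f"Note that my most recently watched movie is {history[-1]}.")

    lines.append(f"There are {len(candidates)} candidate movies I could watch next:")
    lines += [f"[{i}] {title}" for i, title in enumerate(candidates)]
    lines.append(
        "Please rank these candidates by how likely I am to watch them next, "
        "most likely first. Answer with the titles only, one per line."
    )
    return "\n".join(lines)
```

Passing `recency_focused=False` yields the plain sequential prompt, which makes it easy to compare the two variants on the same user.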
Architecture
Figure 1: The overall framework of the LLM-based ranking approach.
Evaluation Highlights
  • LLMs outperform existing zero-shot baselines (UniSRec, VQ-Rec) on MovieLens-1M and Amazon Games datasets
  • The proposed LLM ranker surpasses trained conventional baselines (Pop, BPRMF) when ranking candidates retrieved by multiple generators
  • Bootstrapping (repeated ranking) consistently improves performance by mitigating position bias
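The bootstrapping idea (repeated ranking with shuffled candidates) can be sketched as follows. This is a hedged illustration under stated assumptions: `rank_fn` is a hypothetical callable standing in for one LLM ranking call, candidate titles are assumed unique, and merging rounds by summed positions (a Borda-style rule) is an assumption of this sketch rather than a detail confirmed by the summary.

```python
import random
from collections import defaultdict
from typing import Callable, List, Sequence

# Hypothetical interface: takes candidates in presentation order and
# returns the same items in ranked order (best first).
RankFn = Callable[[List[str]], List[str]]


def bootstrap_rank(
    candidates: Sequence[str],
    rank_fn: RankFn,
    rounds: int = 3,
    seed: int = 0,
) -> List[str]:
    """Rank candidates several times in shuffled order and merge the results."""
    rng = random.Random(seed)
    position_sum = defaultdict(int)

    for _ in range(rounds):
        shuffled = list(candidates)
        rng.shuffle(shuffled)  # a fresh presentation order each round
        ranked = rank_fn(shuffled)
        if sorted(ranked) != sorted(shuffled):
            raise ValueError("rank_fn must return a permutation of its input")
        for position, item in enumerate(ranked):
            position_sum[item] += position  # lower total means ranked higher

    # sorted() is stable, so ties keep the caller's original candidate order.
    return sorted(candidates, key=lambda item: position_sum[item])
```

Because each candidate appears at different positions across rounds, a preference that depends only on where an item was shown tends to average out, while a preference that depends on the item itself survives the merge. The cost is one LLM call per round.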
Breakthrough Assessment
7/10
Provides a solid empirical foundation for LLM-based ranking, identifying critical biases and offering practical fixes (bootstrapping). It shifts the paradigm from 'LLM as Recommender' to 'LLM as Ranker'.