
Can Small Language Models be Good Reasoners for Sequential Recommendation?

Yuling Wang, Changxin Tian, Binbin Hu, Yanhua Yu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou, Liang Pang, Xiao Wang
Beijing University of Posts and Telecommunications, Ant Group, Institute of Computing Technology, Chinese Academy of Sciences, Beihang University
arXiv (2024)
Recommendation | Reasoning | P13N

📝 Paper Summary

Sequential Recommendation | LLM for Recommendation | Knowledge Distillation
SLIM distills step-by-step reasoning capabilities from a large teacher LLM into a smaller student model, generating dense rationale vectors that enhance traditional sequential recommenders.
Core Problem
Using Large Language Models (LLMs) directly for sequential recommendation is computationally expensive and adds high latency, while traditional models lack the open-world reasoning needed to understand complex user behavior.
Why it matters:
  • Real-world recommender systems require low-latency inference, making massive models like GPT-4 impractical to deploy directly for every user request
  • Traditional sequential models suffer from closed-loop data limitations (exposure bias), missing the broader context and reasoning ability inherent in LLMs
  • Existing methods that use LLMs as rankers or knowledge enhancers often ignore the intermediate reasoning steps that explain *why* a user might prefer an item
Concrete Example: A user's history might include several strategy games. A traditional model sees only item IDs. A large teacher LLM can reason 'User likes strategy -> recommend Civilization VI', but is too costly to call for every request. SLIM distills this reasoning process so that a small model can generate a rationale such as 'User enjoys historical strategy games', which is then encoded as a dense vector to guide the recommender.
Key Novelty
Step-by-step knowLedge dIstillation fraMework (SLIM)
  • Uses Chain-of-Thought (CoT) prompting on a teacher LLM to generate macro-to-micro rationales (User Preference -> Category Interest -> Specific Items)
  • Distills this reasoning process into a smaller student model (LLaMA2-7B) by using the teacher's rationales as training labels, enabling the student to 'think' like the teacher
  • Encodes the student's generated text rationales into dense vectors that are fused with traditional ID-based or ID-agnostic recommendation backbones
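The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the prompt wording, the names `build_cot_prompt`, `DistillExample`, `make_distill_example`, and `fuse`, and the weighted-sum fusion with weight `alpha` are all assumptions made for the example. The summary only states that the rationale vectors are "fused" with the backbone, without specifying how.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical macro-to-micro prompt; the paper's exact wording is not reproduced here.
COT_TEMPLATE = (
    "The user has interacted with these items in order: {history}.\n"
    "Step 1: Summarize the user's overall preferences.\n"
    "Step 2: Infer which item categories the user is likely interested in.\n"
    "Step 3: Suggest specific items within those categories."
)


def build_cot_prompt(history: List[str]) -> str:
    """Format an interaction history into a step-by-step (CoT) prompt."""
    return COT_TEMPLATE.format(history=", ".join(history))


@dataclass
class DistillExample:
    prompt: str     # input shown to the student model
    rationale: str  # teacher-generated rationale, used as the training label


def make_distill_example(history: List[str], teacher_rationale: str) -> DistillExample:
    """Pair a prompt with the teacher's rationale to form one fine-tuning example."""
    return DistillExample(build_cot_prompt(history), teacher_rationale)


def fuse(item_vec: List[float], rationale_vec: List[float], alpha: float = 0.5) -> List[float]:
    """Weighted-sum fusion of a backbone vector and a rationale vector.

    Assumption: a simple convex combination stands in for whatever fusion
    the recommender backbone actually uses.
    """
    if len(item_vec) != len(rationale_vec):
        raise ValueError("vectors must have the same dimension")
    return [(1 - alpha) * a + alpha * b for a, b in zip(item_vec, rationale_vec)]
```

In this sketch, the teacher's output for each prompt would populate `rationale`, the student would be fine-tuned on the resulting pairs, and a separate text encoder (not shown) would turn the student's generated rationale into the `rationale_vec` passed to `fuse`.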
Evaluation Highlights
  • Outperforms state-of-the-art baselines on three real-world datasets (Amazon Beauty, Sports, Toys), with significant gains in ID-agnostic settings
  • Student model (LLaMA2-7B) achieves reasoning capabilities comparable to a teacher model roughly 25x its size, i.e., with only about 4% of the teacher's parameters
  • Generates meaningful natural language rationales that improve interpretability without the high inference cost of massive LLMs
Breakthrough Assessment
7/10
A novel application of CoT distillation to sequential recommendation. It effectively bridges the gap between reasoning-capable LLMs and efficiency-focused recommender systems, though the core technique is a standard application of distillation.