
Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation

Benyu Zhang, Qiang Zhang, Jianpeng Cheng, Hong-You Chen, Qifei Wang, Wei Sun, Shen Li, Jia Li, Jiahao Wu, Xiangjun Fan, Hong Yan
Not reported in the paper
arXiv (2026)
Recommendation Pretraining P13N

📝 Paper Summary

LLMs for Recommendation, Scaling Laws, Synthetic Data Generation
The paper establishes the first power-law scaling for recommender LLMs by replacing noisy, biased user logs with a layered synthetic curriculum that decouples semantics, collaborative patterns, and sequential behavior.
Core Problem
LLMs for recommendation fail to exhibit predictable scaling laws because raw user interaction logs are sparse, noisy, and riddled with systemic biases (popularity, position, exposure bias).
Why it matters:
  • Scaling laws are essential for allocating large data and compute budgets, yet none exist for recommendation continual pre-training (CPT)
  • Training on biased logs causes models to internalize and amplify system flaws rather than learning true user preferences
  • Prior attempts (e.g., PLUM) showed 'sub-scaling' where larger models (3B) failed to consistently outperform smaller ones (900M) due to data deficiencies
Concrete Example: A user clicks the first item in a list simply because it was shown first (Position Bias). If trained on this raw log, an LLM learns to recommend items based on screen position rather than user preference, reinforcing the bias. The paper replaces this with synthetic 'unbiased' walks on a user-item graph.
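The idea of generating histories by walking a user-item graph can be sketched as follows. This is a minimal illustration, not the paper's generator: the function names, the uniform neighbor sampling, and the toy interaction pairs are all assumptions. Display position never enters the graph, so it cannot leak into the walk; the sketch makes no attempt at popularity debiasing, which the paper's method is said to address.

```python
import random
from collections import defaultdict


def build_bipartite_graph(interactions):
    """Build user->items and item->users adjacency from (user, item) pairs."""
    user_items = defaultdict(set)
    item_users = defaultdict(set)
    for user, item in interactions:
        user_items[user].add(item)
        item_users[item].add(user)
    return user_items, item_users


def synthetic_history(start_item, user_items, item_users, length, rng):
    """Walk item -> user -> item, sampling each hop uniformly (an assumption)."""
    if start_item not in item_users:
        raise KeyError(f"unknown start item: {start_item!r}")
    history = [start_item]
    current = start_item
    while len(history) < length:
        # Hop to a user who interacted with the current item.
        user = rng.choice(sorted(item_users[current]))
        # Hop to a different item that the same user interacted with.
        candidates = sorted(user_items[user] - {current})
        if not candidates:
            break  # dead end: this user has no other items
        current = rng.choice(candidates)
        history.append(current)
    return history


# Hypothetical toy data, for illustration only.
pairs = [("u1", "a"), ("u1", "b"), ("u2", "b"), ("u2", "c")]
user_items, item_users = build_bipartite_graph(pairs)
print(synthetic_history("a", user_items, item_users, 4, random.Random(0)))
```

Sorting the neighbor sets before sampling keeps the walk reproducible for a fixed seed.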
Key Novelty
Layered Synthetic Curriculum for Recommendation CPT
  • Deconstructs recommendation data into two clean layers: (1) item-text alignment and collaborative filtering rules to teach foundational knowledge, and (2) unbiased synthetic user interaction histories
  • Generates synthetic interaction histories via graph-based random walks that simulate user behavior without the position or popularity biases inherent in real logs
  • Discovers that this principled data enables robust power-law scaling, L = L_inf + A * D^(-alpha), where raw data failed
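To make the power-law form concrete, the sketch below fits L_inf, A, and alpha to a handful of (D, loss) points. Several things here are assumptions rather than the paper's procedure: D is read as the amount of training data, the fit uses a simple grid search over L_inf combined with a log-log least-squares line, and the data points are generated from made-up parameters purely for illustration.

```python
import math


def power_law(D, L_inf, A, alpha):
    """Loss predicted by L = L_inf + A * D^(-alpha)."""
    return L_inf + A * D ** (-alpha)


def fit_power_law(sizes, losses, grid_steps=2000):
    """Grid-search L_inf; for each candidate, fit
    log(L - L_inf) = log(A) - alpha * log(D) by least squares."""
    xs = [math.log(d) for d in sizes]
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    upper = min(losses)  # L_inf must sit below every observed loss
    best = None
    for i in range(grid_steps):
        L_inf = upper * i / grid_steps
        ys = [math.log(loss - L_inf) for loss in losses]
        y_mean = sum(ys) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        intercept = y_mean - slope * x_mean
        sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, L_inf, math.exp(intercept), -slope)
    _, L_inf, A, alpha = best
    return L_inf, A, alpha


# Made-up parameters and sizes, not results from the paper.
sizes = [1e2, 1e3, 1e4, 1e5, 1e6]
losses = [power_law(d, 1.0, 4.0, 0.5) for d in sizes]
L_inf, A, alpha = fit_power_law(sizes, losses)
print(f"L_inf={L_inf:.3f}, A={A:.2f}, alpha={alpha:.2f}")
```

L_inf is the loss floor that more data cannot remove, and alpha sets how quickly the remaining loss shrinks as D grows.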
Evaluation Highlights
  • +130% improvement on Recall@100 for SASRec trained on synthetic data vs. real data, suggesting that the synthetic patterns generalize better
  • Establishes the first robust scaling laws for Rec-LLMs (0.6B to 8B params), with User Interaction History data showing the strongest scaling (alpha ≈ 0.45 to 0.59)
  • Asymmetric transfer: Adding Collaborative Filtering data reduces asymptotic loss on User Interaction History tasks by 31% (L_inf drops from 0.95 to 0.66), while the reverse transfer does not hold
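The two kinds of numbers quoted above, a Recall@K score and a relative change, can be computed as in this short sketch. The function names are illustrative, and the Recall@K definition used (hits in the top K divided by the number of relevant items) is the common one; the paper's exact evaluation protocol is not given here.

```python
def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of relevant items found in the top-k of a ranked list.

    Assumes ranked_items contains no duplicates.
    """
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def relative_change(baseline, new):
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline


# Asymptotic loss figures quoted above: L_inf drops from 0.95 to 0.66.
print(f"{relative_change(0.95, 0.66):.1%}")  # -30.5%
```

A drop of 30.5% rounds to the 31% reported above. By the same formula, a +130% gain means the new score is 2.3 times the baseline.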
Breakthrough Assessment
9/10
First successful demonstration of scaling laws for LLMs in recommendation, a major open problem. The 130% gain over real data and the discovery of asymmetric data synergy are highly significant.