Netflix Artwork Personalization via LLM Post-training

Hyunji Nam, Sejoon Oh, Emma Kong, Yesu Feng, Moumita Bhattacharya
Netflix
arXiv (2026)
P13N Recommendation RL MM

📝 Paper Summary

Recommendation Systems, Visual-Language Personalization
The paper adapts large language models to personalize movie artwork recommendations by representing user history and visual options as text, then post-training via supervised fine-tuning with reasoning and direct preference optimization.
Core Problem
Standard recommendation systems often use a 'one-size-fits-all' image for a title, failing to appeal to diverse user tastes (e.g., users preferring romance vs. action) even when the title itself is personalized.
Why it matters:
  • Artworks are critical decision cues for users deciding to watch or skip content
  • Diverse user bases have heterogeneous preferences that a single image cannot satisfy (e.g., cultural or genre-specific preferences)
  • Existing LLM recommendation work focuses on title selection but neglects the visual presentation layer (artwork personalization)
Concrete Example: A movie might have both intense action and romantic subplots. A user who loves romance might skip the movie if shown an action-heavy artwork, whereas they would watch it if shown an image emphasizing the characters' relationship. The proposed system predicts this preference.
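The scenario above can be cast as the kind of text-only multiple-choice question the summary describes. The following is a minimal sketch, not the paper's actual prompt template: the function name `build_artwork_prompt`, the prompt wording, and the sample captions are all hypothetical, and it assumes the history summary and the artwork captions have already been produced upstream.

```python
from string import ascii_uppercase


def build_artwork_prompt(history_summary: str, title: str, captions: list[str]) -> str:
    """Format one artwork-selection example as a text multiple-choice question.

    Hypothetical template for illustration; the paper's real prompt is not
    given in this summary.
    """
    if not 1 <= len(captions) <= len(ascii_uppercase):
        raise ValueError("need between 1 and 26 candidate captions")
    # Label each caption A, B, C, ... so the model can answer with one letter.
    options = "\n".join(
        f"{letter}. {caption}" for letter, caption in zip(ascii_uppercase, captions)
    )
    return (
        f"User viewing history summary:\n{history_summary}\n\n"
        f"Title: {title}\n"
        f"Candidate artworks (described by caption):\n{options}\n\n"
        "Which artwork is this user most likely to engage with? "
        "Answer with a single letter."
    )


prompt = build_artwork_prompt(
    "Mostly romance dramas, rarely action.",
    "Example Movie",
    ["Two leads embracing at sunset", "A car chase with an explosion"],
)
```

Labeling options with letters keeps the target short, which makes it straightforward to score the model's choice against the artwork the user actually engaged with.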
Key Novelty
Text-Based Visual Personalization via LLM Post-Training
  • Converts the visual personalization problem into a text-based multiple-choice task by captioning artwork images and summarizing user history
  • Uses 'Prediction with Reasoning', in which a larger teacher model (Qwen) generates justifications for the ground-truth choices and these are used to supervise a smaller student model (Llama)
  • Applies Direct Preference Optimization (DPO) to explicitly teach the model to rank the successful artwork higher than rejected alternatives
Evaluation Highlights
  • SFT with reasoning: +5% improvement in Inverse Propensity Score (IPS) over the Netflix production model
  • Direct Preference Optimization (DPO): +3% improvement in IPS over the Netflix production model
  • Zero-shot Llama 3.1 8B performs significantly better than random guessing, suggesting that the model's pretrained world knowledge is already useful for this task
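The IPS figures above come from off-policy evaluation on logged data. As a rough illustration, the textbook inverse-propensity estimator for a deterministic policy looks like the sketch below. The summary does not say which IPS variant the paper uses (for example, clipped or self-normalized), so this is an assumption, and the names `ips_estimate` and `logged` as well as the toy data are hypothetical.

```python
from typing import Callable, Iterable, Tuple

# (context, artwork shown by the logging policy, reward, logging propensity)
LoggedRow = Tuple[str, str, float, float]


def ips_estimate(logged: Iterable[LoggedRow], policy: Callable[[str], str]) -> float:
    """Textbook IPS value estimate for a deterministic target policy."""
    total = 0.0
    count = 0
    for context, shown, reward, propensity in logged:
        if propensity <= 0:
            raise ValueError("logging propensity must be positive")
        # Only rows where the target policy agrees with the logged action
        # contribute, reweighted by 1 / propensity.
        if policy(context) == shown:
            total += reward / propensity
        count += 1
    if count == 0:
        raise ValueError("no logged rows")
    return total / count


# Toy log: three impressions with known logging propensities.
logged = [
    ("user1", "A", 1.0, 0.5),
    ("user2", "B", 0.0, 0.5),
    ("user3", "A", 1.0, 0.25),
]
always_a = ips_estimate(logged, lambda context: "A")  # (2 + 4) / 3
```

A relative improvement such as "+5%" would then be the ratio of two such estimates, one for the candidate model and one for the production model, computed on the same log.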
Breakthrough Assessment
7/10
Novel application of LLMs to the specific industrial problem of artwork personalization. Demonstrates successful transfer of reasoning capabilities to visual preference tasks, though the scope is specific to one platform's dataset.