LLM4MSR leverages a frozen LLM to reason about user and scenario semantics, then uses hierarchical meta-networks to generate adaptive weights that enhance a multi-scenario recommendation backbone.
Core Problem
Existing Multi-Scenario Recommendation (MSR) methods rely heavily on simple domain indicators and collaborative signals, ignoring rich semantic scenario knowledge and personalized cross-scenario preferences.
Why it matters:
Insufficient scenario knowledge (e.g., relying only on ID) leads to poor correlation modeling between diverse business domains
Directly deploying LLMs in industrial systems is hindered by high inference latency and tuning costs
Current methods fail to disentangle and explicitly model users' personalized interests across different scenarios
Concrete Example: In an app with 'search' and 'recommendation' scenarios, standard models distinguish the two only by a domain ID. They fail to capture that a user's positive interaction with 'electronics' in 'search' signals a specific interest, one that should transfer to 'recommendation' differently than a random click would.
Key Novelty
LLM-Driven Hierarchical Meta-Network Injection
Uses a frozen LLM not as a feature extractor or ranker, but as a 'reasoner' that outputs a high-dimensional hidden state encapsulating scenario and user semantics
This hidden state drives 'meta-networks' that dynamically generate the weights and biases (meta layers) for the recommendation backbone, effectively modulating the backbone with semantic knowledge
Adopts a hierarchical structure where user-level knowledge modulates bottom layers and scenario-level knowledge modulates parallel layers
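The injection idea above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the class `MetaNetwork`, the toy dimensions, the single linear map used as the meta-network, and the additive fusion of the scenario-level output with a stand-in tower output are all hypothetical choices made for readability.

```python
import random

def linear(x, W, b):
    """Compute y = x @ W + b for one vector x (W has shape in_dim x out_dim)."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

def relu(v):
    return [max(0.0, a) for a in v]

class MetaNetwork:
    """Hypothetical meta-network: maps a frozen-LLM hidden state to the
    weight matrix and bias of one d x d meta layer."""
    def __init__(self, llm_dim, d, rng, scale=0.1):
        self.d = d
        out_dim = d * d + d  # d*d weights followed by d biases
        self.W = [[rng.uniform(-scale, scale) for _ in range(out_dim)]
                  for _ in range(llm_dim)]
        self.b = [0.0] * out_dim

    def generate(self, h):
        flat = linear(h, self.W, self.b)
        d = self.d
        weight = [flat[i * d:(i + 1) * d] for i in range(d)]
        bias = flat[d * d:]
        return weight, bias

def meta_layer(x, weight, bias):
    """Apply a generated layer to a backbone activation."""
    return relu(linear(x, weight, bias))

rng = random.Random(0)
LLM_DIM, D = 16, 4  # toy sizes; the summary reports 4096 for ChatGLM2-6B

user_meta = MetaNetwork(LLM_DIM, D, rng)
scenario_meta = MetaNetwork(LLM_DIM, D, rng)

# Stand-ins for hidden states produced by the frozen LLM (assumed inputs).
h_user = [rng.gauss(0, 1) for _ in range(LLM_DIM)]
h_scenario = [rng.gauss(0, 1) for _ in range(LLM_DIM)]

x = [rng.gauss(0, 1) for _ in range(D)]  # bottom-layer backbone activation

# User-level knowledge modulates the bottom layer.
bottom_out = meta_layer(x, *user_meta.generate(h_user))

# Scenario-level meta layer runs in parallel with a backbone tower;
# summing the two outputs is an assumption of this sketch.
tower_out = [0.5, -0.2, 0.1, 0.3]  # stand-in for the backbone tower output
scenario_out = meta_layer(bottom_out, *scenario_meta.generate(h_scenario))
fused = [t + s for t, s in zip(tower_out, scenario_out)]
```

Only the meta-networks and the backbone carry trainable parameters in this picture; the hidden states `h_user` and `h_scenario` are treated as fixed inputs.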
Architecture
The overall architecture of LLM4MSR, detailing the prompt construction, LLM reasoning, and hierarchical meta-network injection into the backbone.
Breakthrough Assessment
8/10
Proposes a novel paradigm of using LLMs to generate parameters (meta-learning) rather than just features or text, solving the efficiency bottleneck while injecting semantic intelligence.
Trainable Parameters: Meta Networks (MLPs), Backbone Parameters (Embeddings, Towers)
Key Hyperparameters:
llm_hidden_dim: 4096 (for ChatGLM2-6B)
meta_layer_structure: User-level at bottom, Scenario-level in parallel
Compute: The LLM's high inference latency is mitigated by keeping it frozen, so its hidden states can be computed offline and cached, and by using it only to drive meta-parameter generation rather than running it on every request
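One way to picture the caching argument is a lookup table filled offline and read at serving time. The sketch below is an assumption-laden illustration: `HiddenStateCache`, `stub_llm_encode`, the key format, and the prompt strings are hypothetical, and the stub merely stands in for a frozen LLM forward pass.

```python
from typing import Callable, Dict, List

class HiddenStateCache:
    """Hypothetical offline cache of frozen-LLM hidden states."""
    def __init__(self, encode: Callable[[str], List[float]]):
        self._encode = encode
        self._store: Dict[str, List[float]] = {}
        self.llm_calls = 0  # counts how often the expensive encoder runs

    def precompute(self, prompts: Dict[str, str]) -> None:
        """Offline step: run the frozen LLM once per user or scenario prompt."""
        for key, prompt in prompts.items():
            self._store[key] = self._encode(prompt)
            self.llm_calls += 1

    def lookup(self, key: str) -> List[float]:
        """Online step: a dictionary read, with no LLM call."""
        return self._store[key]

def stub_llm_encode(prompt: str) -> List[float]:
    # Stand-in for a frozen LLM forward pass; NOT a real model call.
    return [float(len(prompt)), float(sum(map(ord, prompt)) % 97)]

prompts = {
    "scenario:search": "Describe the search scenario.",      # illustrative text
    "user:42": "Summarize this user's interests.",           # illustrative text
}
cache = HiddenStateCache(stub_llm_encode)
cache.precompute(prompts)

# Simulated serving loop: many requests, zero additional LLM calls.
for _ in range(1000):
    hidden = cache.lookup("user:42")
```

Because the LLM is frozen, a cached hidden state stays valid until the prompt for that user or scenario changes, which is what makes the offline step worthwhile.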
Comparison to Prior Work
vs. STAR/MMoE: Explicitly incorporates semantic scenario knowledge via LLM reasoning rather than just collaborative signals
vs. CTRL/KAR: Uses LLM to generate *parameters* (meta-learning) rather than just input features/embeddings, and uses a generative LLM (ChatGLM) rather than a PLM (BERT)
vs. Fine-tuning LLMs: Keeps LLM frozen to ensure efficiency and deployability in industrial systems
Limitations
Dependency on the quality of the frozen LLM's reasoning capabilities
Increased model complexity due to the addition of meta-networks
Inference latency concerns if LLM inference is not cached or processed offline (though the paper claims the approach is efficient)
Code and data are available at the provided GitHub links. Uses public datasets (KuaiSAR, Amazon). Prompt templates are provided in the Appendix.
📊 Experiments & Results
Evaluation Setup
CTR Prediction on multi-scenario datasets
Benchmarks:
KuaiSAR-small (Multi-scenario CTR prediction)
KuaiSAR (Multi-scenario CTR prediction)
Amazon (Multi-scenario CTR prediction)
Metrics:
Logloss
AUC
Statistical methodology: Not explicitly reported in the provided text
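For reference, the two reported metrics can be computed as follows. This is a generic sketch of the standard definitions (binary cross-entropy and pairwise-ranking AUC) on made-up labels and scores; it is not the paper's evaluation code, and the data below are illustrative only.

```python
import math

def logloss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random
    negative, counting ties as half; higher is better."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Illustrative click labels and predicted click probabilities.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]

print("logloss:", logloss(labels, scores))
print("auc:", auc(labels, scores))
```

The pairwise AUC is quadratic in the number of examples, which is fine for a sketch; production evaluation typically uses a rank-based formulation.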
Main Takeaways
LLM4MSR effectively enhances various multi-scenario backbones (like STAR, PLE) by injecting semantic knowledge.
The hierarchical meta-network structure (user-level bottom + scenario-level parallel) is empirically the most effective configuration.
The approach is suited to industrial deployment because the LLM stays frozen: it needs no expensive fine-tuning, and its outputs can be cached or computed offline instead of incurring high-latency inference on every request.
Provides better interpretability via the LLM's ability to output natural language reasoning alongside the vector representations used for recommendation.