Chen Hao, Xie Runfeng, Cui Xiangyang, Yan Zhou, Wang Xin, Xuan Zhanwei, Zhang Kai
State Key Laboratory of Communication Content Cognition, People’s Daily Online, Beijing, China
arXiv (2023)
Recommendation, KG, P13N
📝 Paper Summary
News Recommendation, Knowledge Graphs (KG), Large Language Models (LLM)
LKPNR improves news recommendation by combining traditional encoders with LLMs for deep semantic understanding and Knowledge Graphs for collaborative entity connections, addressing sparse data issues for inactive users.
Core Problem
Traditional news recommendation models struggle with complex semantic understanding of news text and fail to effectively recommend for inactive users (the 'long tail problem') due to insufficient historical data.
Why it matters:
Existing methods rely heavily on rich historical behaviors, leaving inactive users with poor recommendations
Traditional text encoders (CNN/LSTM) often miss complex semantic nuances and external knowledge connections in news articles
The 'long tail problem' means a vast majority of less popular news items are rarely recommended, reducing diversity and system effectiveness
Concrete Example: A user clicks a few news items about specific entities (e.g., 'Warriors'). A traditional model might fail to recommend a relevant but less popular article about 'D'Angelo Russell' if the user hasn't clicked it before. LKPNR uses the KG to link 'Warriors' to 'D'Angelo Russell' (a team member) and the LLM to understand the trade context, surfacing the relevant article even without direct click history.
Key Novelty
LLM and KG Augmented Personalized News Recommendation (LKPNR)
Augments standard news encoders by running news text through an LLM to extract deep semantic representations (hidden states)
Constructs a Knowledge Graph subgraph for entities in the news, using multi-hop neighbors to capture latent connections between seemingly unrelated news items
Fuses three distinct representations: the general encoder's output, the LLM's semantic vector, and the KG's structural entity vector
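The multi-hop neighbor idea can be illustrated with a breadth-first traversal. This is a minimal sketch, not the paper's implementation: the function name `k_hop_neighbors`, the adjacency-list format, and the toy graph are all assumptions made for illustration.

```python
from collections import deque

def k_hop_neighbors(graph, seeds, max_hops):
    """Return {entity: hop_distance} for entities within max_hops of any seed."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # hop budget reached, do not expand this node
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance

# Hypothetical toy graph (undirected adjacency lists), for illustration only.
toy_graph = {
    "Warriors": ["D'Angelo Russell", "NBA"],
    "D'Angelo Russell": ["Warriors"],
    "NBA": ["Warriors", "Lakers"],
    "Lakers": ["NBA"],
}

hops = k_hop_neighbors(toy_graph, ["Warriors"], max_hops=2)
for entity, hop in sorted(hops.items(), key=lambda item: item[1]):
    print(hop, entity)
```

With a budget of one hop, only directly linked entities are collected; raising the budget to two also pulls in entities reachable through one intermediary.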
Architecture
The complete LKPNR framework. Bottom left shows input news. Center shows the three encoders (General, KG, LLM) producing vectors r_GNE, r_KG, r_LLM. Right side shows user history encoding and click prediction.
Evaluation Highlights
+2.47 points AUC (0.6802 → 0.7049, absolute) over the NRMS baseline on the MIND dataset
+2.25 points nDCG@5 (0.3661 → 0.3886, absolute) over the NRMS baseline
ChatGLM2-6B outperforms larger models like LLAMA2-13B in this framework, likely due to better alignment with the data distribution
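The two gains above are differences in the raw metric values reported in the results table (AUC 0.6802 → 0.7049, nDCG@5 0.3661 → 0.3886), so it helps to separate the absolute difference from the relative improvement. A small sketch of that arithmetic, using a hypothetical helper named `gains`:

```python
def gains(baseline, ours):
    """Return (absolute difference, relative improvement) of ours over baseline."""
    absolute = ours - baseline
    relative = absolute / baseline
    return absolute, relative

# Values taken from the results table of this summary.
abs_auc, rel_auc = gains(0.6802, 0.7049)
abs_ndcg, rel_ndcg = gains(0.3661, 0.3886)

print(f"AUC:    {abs_auc:+.4f} absolute, {rel_auc:+.2%} relative")
print(f"nDCG@5: {abs_ndcg:+.4f} absolute, {rel_ndcg:+.2%} relative")
```

Read this way, the headline numbers are absolute differences in the metric; the relative improvements over the baseline are somewhat larger.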
Breakthrough Assessment
6/10
Solid integration of two trending technologies (LLM + KG) into established baselines with clear empirical gains. While the architecture is a logical extension rather than a paradigm shift, it effectively addresses the specific problem of semantic sparsity.
⚙️ Technical Details
Problem Definition
Setting: Personalized News Recommendation: predicting click probability given user history and candidate news
Inputs: User's click history H (sequence of news) and a candidate news article n
Outputs: Click probability score (matching score between user representation and news representation)
Pipeline Flow
Input Processing: Extract title, abstract, and entities from news
Parallel Encoding: Process news through three parallel encoders (General, LLM-Augmented, KG-Augmented)
Fusion: Concatenate representations to form final news vector
User Encoding: Aggregate history of news vectors into user vector
Prediction: Compute dot product between user and candidate news vectors
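The fusion and prediction steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the vectors are toy values, the concatenation order is arbitrary, and mean pooling stands in for the attention-based user encoder described in the source. The names `fuse`, `mean_pool`, and `click_score` are hypothetical.

```python
def fuse(r_gne, r_llm, r_kg):
    """Concatenate the general, LLM, and KG representations of one news item."""
    return list(r_gne) + list(r_llm) + list(r_kg)

def mean_pool(vectors):
    """Placeholder user encoder (assumption): average the browsed news vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def click_score(user_vec, news_vec):
    """Matching score: dot product of the user and candidate news vectors."""
    if len(user_vec) != len(news_vec):
        raise ValueError("user and news vectors must have the same dimension")
    return sum(u * n for u, n in zip(user_vec, news_vec))

# Toy representations for two browsed items and one candidate item.
history = [
    fuse([1.0, 0.0], [0.5], [0.0]),
    fuse([0.0, 1.0], [0.5], [1.0]),
]
candidate = fuse([1.0, 1.0], [1.0], [0.0])

user_vec = mean_pool(history)
score = click_score(user_vec, candidate)
print(score)
```

Because the three representations are concatenated, the dot product decomposes into one contribution per encoder, which makes it easy to see how each path affects the final score.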
System Modules
General News Encoder (News Encoding)
Learn basic word-level semantic representations using traditional methods (e.g., Attention, CNN)
Model or implementation: Based on NRMS or NAML architecture
LLM-Augmented Encoder (News Encoding)
Extract deep semantic features using a pre-trained LLM
Model or implementation: ChatGLM2-6B (default), LLAMA2-13B, or RWKV-7B
KG-Augmented Encoder (News Encoding)
Capture structural entity relationships via multi-hop graph traversal
Model or implementation: Custom attention-based graph aggregator
LK-Aug User Encoder
Aggregate sequence of browsed news representations into a user profile
Model or implementation: Attention-based sequence aggregator (from NRMS/NAML)
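As an illustration of an attention-based sequence aggregator, the sketch below implements additive attention pooling over browsed news vectors. It is an assumption about the general shape of such a user encoder, not the paper's exact module: the parameters `W`, `b`, and `q` are hypothetical and would be learned in practice, and NRMS additionally applies multi-head self-attention before pooling.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention_pool(history, W, b, q):
    """Pool news vectors into one user vector with weights softmax(q . tanh(W h + b))."""
    scores = []
    for h in history:
        projected = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
            for row, bias in zip(W, b)
        ]
        scores.append(sum(qi * pi for qi, pi in zip(q, projected)))
    weights = softmax(scores)
    dim = len(history[0])
    user_vec = [sum(w * h[d] for w, h in zip(weights, history)) for d in range(dim)]
    return user_vec, weights

# Toy inputs: three browsed news vectors and hand-picked parameters.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
q = [1.0, 1.0]

user_vec, weights = additive_attention_pool(history, W, b, q)
print(weights, user_vec)
```

The weights sum to one, so the user vector is a convex combination of the browsed news vectors, with more weight on items the attention query scores highly.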
Novel Architectural Elements
Triple-path news encoding: fusing General, LLM, and KG representations
LLM-to-KG bridge: using the LLM's text representation to generate the 'query' vector for the KG attention mechanism, rather than the general encoder's output
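The LLM-to-KG bridge can be pictured as attention in which the query comes from the LLM representation and the keys and values are neighbor entity embeddings. The following is a single-head sketch under assumptions: scaled dot-product scoring is used for illustration, the LLM vector is assumed to be already projected to the entity embedding dimension, and all embeddings are made-up toy values.

```python
import math

def kg_attention(query, neighbor_embs):
    """Scaled dot-product attention of one query over neighbor entity embeddings."""
    dim = len(query)
    if any(len(emb) != dim for emb in neighbor_embs):
        raise ValueError("query and entity embeddings must share a dimension")
    scores = [
        sum(q * e for q, e in zip(query, emb)) / math.sqrt(dim)
        for emb in neighbor_embs
    ]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    r_kg = [sum(w * emb[d] for w, emb in zip(weights, neighbor_embs)) for d in range(dim)]
    return r_kg, weights

# Hypothetical projected LLM representation of a news item (the query).
llm_query = [1.0, 0.0]
# Hypothetical embeddings of two KG neighbors of the news entities.
neighbors = [[0.9, 0.1], [0.2, 0.8]]

r_kg, weights = kg_attention(llm_query, neighbors)
print(weights, r_kg)
```

Swapping `llm_query` for the general encoder's output gives the alternative query strategy that the source compares against; only the query vector changes, not the attention computation.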
Modeling
Base Model: Integrates with NRMS or NAML as the 'General' backbone; uses ChatGLM2-6B, LLAMA2-13B, or RWKV-7B as the LLM component
Training Method: Supervised learning on click logs (negative sampling)
Objective Functions:
Purpose: Maximize likelihood of clicked news over non-clicked news.
Formally: negative log-likelihood loss, L = -Σ_i log(p_i), summed over positive (clicked) samples i, where p_i is the model's predicted click probability for sample i.
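A common way to obtain p_i under negative sampling is a softmax over the score of the clicked item and the scores of its sampled non-clicked items. The sketch below assumes that formulation, which is standard for NRMS-style training but is not spelled out in this summary; the function names and the toy scores are hypothetical.

```python
import math

def nll_for_positive(pos_score, neg_scores):
    """-log p_i, with p_i = softmax probability of the clicked item
    among itself and its sampled negatives (assumed formulation)."""
    scores = [pos_score] + list(neg_scores)
    peak = max(scores)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum_exp - pos_score

def total_loss(samples):
    """Sum the per-positive losses: L = -sum_i log(p_i)."""
    return sum(nll_for_positive(pos, negs) for pos, negs in samples)

# Toy batch: (score of clicked item, scores of sampled non-clicked items).
batch = [
    (2.0, [0.5, -1.0, 0.0]),
    (0.0, [0.0, 0.0, 0.0]),
]
print(total_loss(batch))
```

The loss shrinks as the clicked item's score rises above the negatives, and equals log(K + 1) for a positive whose score ties with all K negatives.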
vs. Liu et al. (2023) [Generative News Rec]: LKPNR focuses on representation learning fusion rather than generating user profiles or data augmentation [not cited in paper]
Limitations
Computational cost of running LLM inference for every news item is high (though caching is possible)
Dependency on external Knowledge Graph quality and coverage
Performance varies significantly with choice of LLM (ChatGLM2 > LLAMA2 > RWKV in their tests)
Code is publicly available on GitHub. MIND dataset is public. Specific hyperparameters (LR, batch size) are provided. KG source (Wiki KG) is implied but specific version/dump not detailed.
📊 Experiments & Results
Evaluation Setup
Offline evaluation on historical click logs
Benchmarks:
MIND (Microsoft News Dataset) (News Recommendation)
Metrics:
AUC (Area Under ROC)
MRR (Mean Reciprocal Rank)
nDCG@5
nDCG@10
Statistical methodology: Not explicitly reported in the paper
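For reference, the metrics listed above can be computed per impression as sketched below. This is a minimal sketch with assumptions: labels are binary (1 = clicked), the reciprocal rank follows this summary's definition (rank of the first clicked item), and in practice each metric is typically averaged over all impressions. Function names and the toy impression are hypothetical.

```python
import math

def rank_labels(labels, scores):
    """Return labels reordered by descending predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]

def auc(labels, scores):
    """Fraction of (clicked, non-clicked) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def reciprocal_rank(labels, scores):
    """1 / rank of the first clicked item, or 0.0 if nothing was clicked."""
    for rank, label in enumerate(rank_labels(labels, scores), start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0

def dcg_at_k(ranked_labels, k):
    """Discounted cumulative gain of the top-k items with binary relevance."""
    return sum(l / math.log2(i + 2) for i, l in enumerate(ranked_labels[:k]))

def ndcg_at_k(labels, scores, k):
    """DCG@k normalized by the DCG@k of the ideal ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(rank_labels(labels, scores), k) / ideal

# Toy impression: four candidates, two of which were clicked.
labels = [1, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.5]
print(auc(labels, scores), reciprocal_rank(labels, scores), ndcg_at_k(labels, scores, 5))
```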
Key Results
Benchmark      | Metric | Baseline | This Paper | Δ
MIND (Sampled) | AUC    | 0.6802   | 0.7049     | +0.0247
MIND (Sampled) | nDCG@5 | 0.3661   | 0.3886     | +0.0225
MIND (Sampled) | AUC    | 0.6845   | 0.7023     | +0.0178
MIND (Sampled) | AUC    | 0.7049   | 0.6997     | -0.0052
MIND (Sampled) | AUC    | 0.7049   | 0.6842     | -0.0207
Experiment Figures
Comparison of query strategies for the KG attention mechanism (General Encoder Query vs. LLM Query) across different numbers of attention heads
Visualization of attention weights on KG neighbors for a specific case study
Main Takeaways
Integrating both LLM and KG significantly outperforms traditional baselines (NRMS, NAML)
LLM contribution is more substantial than KG contribution (larger drop when removed), but both are additive
Using the LLM representation to query the KG works better than using the general encoder representation, suggesting LLMs capture better semantic 'keys' for knowledge retrieval
ChatGLM2-6B performed best among LLMs tested, likely due to bilingual (Chinese/English) training data covering more news context
Large Language Models (LLMs) and hidden state extraction
Attention mechanisms
Key Terms
MIND: A large-scale dataset for news recommendation constructed from MSN news logs
Long tail problem: The phenomenon where a small number of popular items get most attention, while the vast majority of items (the tail) are rarely recommended
Knowledge Graph (KG): A structured representation of knowledge using entities (nodes) and relations (edges), used here to link news entities
Hop: A step in a graph traversal; 1-hop neighbors are directly connected, 2-hop are connected via one intermediary
AUC: Area Under the ROC Curve—a metric measuring the model's ability to distinguish between positive (clicked) and negative (non-clicked) samples
MRR: Mean Reciprocal Rank—a metric evaluating how high the first relevant item appears in the recommendation list
nDCG: Normalized Discounted Cumulative Gain—a ranking metric that gives more credit to relevant items appearing higher in the list