Evaluation Setup
Link prediction on Amazon product graphs (Electronics, Cell Phones, Grocery, Home).
Benchmarks:
- Amazon Electronics (Complementary Product Recommendation)
- Amazon Cell Phones (Complementary Product Recommendation)
- Amazon Grocery (Complementary Product Recommendation)
- Amazon Home (Complementary Product Recommendation)
Metrics:
- Hit@K (Accuracy)
- NDCG@K (Ranking Quality)
- Entropy (Diversity of token distribution)
- Vocabulary Size (Diversity)
Statistical methodology: Not explicitly reported in the paper.
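The four metrics listed above can be sketched in a few lines of Python. This is a minimal illustration and not the paper's evaluation code: the function names are hypothetical, relevance is assumed to be binary (an item either is or is not a ground-truth complement), and since the summary does not say which tokens the entropy and vocabulary size are computed over, the sketch simply takes any list of string tokens.

```python
import math
from collections import Counter


def hit_at_k(ranked, relevant, k):
    """Return 1.0 if any of the top-k ranked items is relevant, else 0.0."""
    return float(any(item in relevant for item in ranked[:k]))


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@K with the usual log2 position discount."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def token_entropy(tokens):
    """Shannon entropy (in bits) of the empirical token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def vocabulary_size(tokens):
    """Number of distinct tokens."""
    return len(set(tokens))


# Toy data, invented purely for illustration.
ranked = ["case", "charger", "cable"]
relevant = {"charger"}
tokens = ["usb", "cable", "usb", "case"]

print(hit_at_k(ranked, relevant, 1))   # 0.0: top item is not relevant
print(hit_at_k(ranked, relevant, 3))   # 1.0: relevant item is in the top 3
print(ndcg_at_k(ranked, relevant, 3))
print(token_entropy(tokens))           # 1.5 bits
print(vocabulary_size(tokens))         # 3
```

Under the binary-relevance assumption used here, NDCG@1 reduces to Hit@1 for any query with at least one relevant item, because the only discount term is 1/log2(2) = 1.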
Key Results
Results on Amazon Cell Phones dataset showing improvement from the proposed Div.+Acc. pipeline over baselines.
| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Amazon Cell Phones | Hit@1 | 1.154 | 1.351 | +0.197 |
| Amazon Cell Phones | Hit@1 | 1.087 | 1.326 | +0.239 |
| Amazon Cell Phones | NDCG@1 | 1.087 | 1.326 | +0.239 |
| Amazon Cell Phones | Vocabulary Size | 19.5 | 20.8 | +1.3 |

Results on Amazon Home dataset demonstrating gains in ranking quality.
| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Amazon Home | Hit@1 | 3.383 | 3.704 | +0.321 |
| Amazon Home | NDCG@1 | 3.354 | 3.564 | +0.210 |
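The Δ column is the absolute difference between the "This Paper" and "Baseline" columns. The short check below recomputes it from the values in the table; the relative gain it also prints is derived here for convenience and is not a figure reported in the source. The names `rows`, `absolute_delta`, and `relative_gain_pct` are illustrative.

```python
# (benchmark, metric, baseline, this_paper, reported_delta), copied from the table.
rows = [
    ("Amazon Cell Phones", "Hit@1", 1.154, 1.351, 0.197),
    ("Amazon Cell Phones", "Hit@1", 1.087, 1.326, 0.239),
    ("Amazon Cell Phones", "NDCG@1", 1.087, 1.326, 0.239),
    ("Amazon Cell Phones", "Vocabulary Size", 19.5, 20.8, 1.3),
    ("Amazon Home", "Hit@1", 3.383, 3.704, 0.321),
    ("Amazon Home", "NDCG@1", 3.354, 3.564, 0.210),
]


def absolute_delta(baseline, proposed):
    """Absolute improvement, rounded to the table's three decimals."""
    return round(proposed - baseline, 3)


def relative_gain_pct(baseline, proposed):
    """Improvement as a percentage of the baseline value."""
    return 100.0 * (proposed - baseline) / baseline


for benchmark, metric, baseline, proposed, reported in rows:
    delta = absolute_delta(baseline, proposed)
    gain = relative_gain_pct(baseline, proposed)
    status = "ok" if abs(delta - reported) < 1e-9 else "mismatch"
    print(f"{benchmark:20s} {metric:16s} delta={delta:+.3f} ({gain:+.1f}%) {status}")
```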
Main Takeaways
- The Diversity Agent alone improves both accuracy and diversity metrics at small K (K=1), suggesting that diversifying the top recommendations also helps retrieve relevant items that the GNN missed.
- The Accuracy Agent further boosts the accuracy metrics (Hit@K, NDCG@K) but consistently reduces the diversity metrics (Entropy, Vocabulary Size) relative to the Diversity Agent's output, confirming the accuracy-diversity tradeoff.
- The method is robust across different underlying GNN architectures (GraphSAGE, GAT, SComGNN), delivering consistent gains without retraining the base models.