| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Top-5 Recommendation Results: Chat-Rec variants generally outperform baselines in Precision and NDCG, with text-davinci-003 performing best; Recall is slightly below the baseline. | | | | |
| MovieLens 100K (Top-5) | Precision | 0.3030 | 0.3240 | +0.0210 |
| MovieLens 100K (Top-5) | NDCG | 0.3425 | 0.3802 | +0.0377 |
| MovieLens 100K (Top-5) | Recall | 0.1455 | 0.1404 | -0.0051 |
| Rating Prediction Results: Chat-Rec outperforms baselines in predicting explicit ratings. RMSE and MAE are error metrics, so a negative Δ is an improvement. | | | | |
| MovieLens 100K (Rating Prediction) | RMSE | 0.933 | 0.785 | -0.148 |
| MovieLens 100K (Rating Prediction) | MAE | 0.734 | 0.593 | -0.141 |
| Ablation on Prompt Construction: Removing the recommender system's top-1 item from the prompt background lowers Top-5 NDCG by 0.0747. In this row, Baseline is the full Chat-Rec prompt and This Paper is the ablated prompt. | | | | |
| MovieLens 100K (Top-5) | NDCG | 0.3802 | 0.3055 | -0.0747 |
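
The Δ column mixes metrics where higher is better (Precision, NDCG, Recall) with metrics where lower is better (RMSE, MAE), so the sign alone does not say whether a result improved. The sketch below recomputes Δ from the values in the table and adds a direction-aware check. The `lower_is_better` flag and the helper names are illustrative assumptions, not part of the source.

```python
# Values copied from the table: (metric, baseline, reported, lower_is_better).
# The lower_is_better flag is an assumption added here for illustration.
ROWS = [
    ("Precision", 0.3030, 0.3240, False),
    ("NDCG",      0.3425, 0.3802, False),
    ("Recall",    0.1455, 0.1404, False),
    ("RMSE",      0.933,  0.785,  True),
    ("MAE",       0.734,  0.593,  True),
]


def delta(baseline, reported):
    """Absolute difference, reported minus baseline, rounded to 4 places."""
    return round(reported - baseline, 4)


def relative_change(baseline, reported):
    """Change as a fraction of the baseline value."""
    return (reported - baseline) / baseline


def improved(baseline, reported, lower_is_better):
    """True when the reported value is better given the metric's direction."""
    if lower_is_better:
        return reported < baseline
    return reported > baseline


if __name__ == "__main__":
    for name, base, rep, lower in ROWS:
        print(
            f"{name:<10} delta={delta(base, rep):+.4f} "
            f"relative={relative_change(base, rep):+.1%} "
            f"improved={improved(base, rep, lower)}"
        )
```

Read this way, Precision, NDCG, RMSE, and MAE all move in the favorable direction, while Recall is the one metric that does not.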