| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Sequence Residual Task: Accuracy of LLaMA in identifying the 'residual' (newly added) item by comparing the hidden states at steps t and t+1. | ||||
| MovieLens | Accuracy | 52.63 | 52.63 | 0.00 |
| MovieLens | Accuracy | N/A | 97.89 | N/A |
| Steam | Accuracy | N/A | 86.33 | N/A |
| Oracle Instantiation: Using RecInterpreter to 'decode' DreamRec's ideal latent vector into text, then having ChatGPT judge preference against SASRec. | ||||
| MovieLens | ChatGPT Preference | 35.79 | 50.53 | +14.74 |