ID Generator: A language model that compresses item metadata into concise, unique textual tokens to serve as the item's identifier
Base Recommender: The downstream LLM that takes a user's interaction history, expressed as a sequence of generated IDs, and predicts the ID of the next item
Diverse Beam Search (DBS): A beam search variant that divides the beams into groups and penalizes a group for choosing tokens already selected by earlier groups, yielding multiple diverse candidate sequences; used here to help ensure generated IDs are unique across items
Constrained Sequence Decoding: A generation strategy in which, at each step, the candidate next tokens are restricted to those permitted by a prefix tree (trie) built from all valid item IDs, so the model can only generate the ID of an existing item
Zero-shot Recommendation: Making recommendations on a dataset the model has never seen during training, relying on generalizable knowledge
P5: A baseline generative recommendation model that assigns numerical indices (e.g., 'item_54') as item IDs; these IDs carry no semantic meaning
OOV tokens: Out-of-vocabulary tokens; here, usually the new special tokens that P5 adds to represent items, which pre-trained LLMs have no prior knowledge of
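
To illustrate the Constrained Sequence Decoding entry above, here is a minimal sketch of prefix-tree decoding in plain Python. It is not the authors' implementation: the toy catalog, the END marker, the function names, and the hand-written toy_score function are all assumptions made for illustration. In a real system the score would come from the LLM's next-token logits, with disallowed tokens masked out.

```python
from typing import Callable, List, Sequence

END = "<eos>"  # hypothetical end-of-ID marker, not taken from the source


def build_trie(ids: Sequence[Sequence[str]]) -> dict:
    """Build a nested-dict prefix tree from tokenized item IDs."""
    root: dict = {}
    for tokens in ids:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: dict, prefix: Sequence[str]) -> List[str]:
    """Return tokens that may follow `prefix`; empty if the prefix is invalid."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []
        node = node[tok]
    return sorted(node.keys())


def constrained_greedy_decode(
    trie: dict,
    score: Callable[[List[str], str], float],
    max_len: int = 16,
) -> List[str]:
    """Greedy decoding that only ever picks tokens the trie allows.

    Assumes max_len exceeds the longest ID, so decoding stops at END.
    """
    prefix: List[str] = []
    for _ in range(max_len):
        candidates = allowed_next(trie, prefix)
        if not candidates:
            break
        best = max(candidates, key=lambda t: score(prefix, t))
        if best == END:
            break
        prefix.append(best)
    return prefix


# Toy catalog of tokenized textual IDs (illustrative only).
catalog = [
    ["wireless", "mouse"],
    ["wireless", "keyboard"],
    ["gaming", "mouse", "pad"],
]
trie = build_trie(catalog)


def toy_score(prefix: List[str], token: str) -> float:
    """Stand-in for model logits: a fixed preference per token."""
    prefs = {"pad": 3.0, "keyboard": 2.0, "mouse": 1.5,
             "wireless": 1.0, "gaming": 0.5, END: 0.1}
    return prefs.get(token, 0.0)


decoded = constrained_greedy_decode(trie, toy_score)
print(decoded)
```

Although "pad" has the highest unconstrained score in this toy setup, it is never a permitted continuation at the steps where it would be chosen, so the decoded sequence is always one of the catalog IDs.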