Cold-start: The problem of recommending items to users with very few historical interactions, which leaves too little signal to infer their preferences
Long-tail: Items that are rarely interacted with, leading to sparse training data and poor representation learning
Knowledge Distillation: Training a smaller 'student' model to reproduce the behavior or predictions of a larger 'teacher' model (here, an LLM)
Pairwise Ranking Loss: A loss function that optimizes the relative ordering of item pairs (item A > item B) rather than absolute scores
AUC: Area Under the ROC Curve, a metric equal to the probability that a randomly chosen positive item is ranked higher than a randomly chosen negative item
Gating Mechanism: A neural network component that outputs a scalar (usually between 0 and 1) to control how much information flows through a specific path
Hallucination: When an LLM generates plausible-sounding but factually incorrect or nonsensical information
LLM: Large Language Model, a neural network with a very large number of parameters, trained on large text corpora, that can perform reasoning and generation tasks
Entropy: A measure of uncertainty in a probability distribution; high entropy implies the model is unsure of its prediction
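
Several of the terms above (Pairwise Ranking Loss, AUC, Gating Mechanism, Entropy) are easier to follow with small numeric examples. The sketch below is illustrative only and does not come from the source: the function names are hypothetical, the pairwise loss uses the common logistic (BPR-style) form as an assumption, entropy is computed with natural logarithms, and the gate is shown blending two scalar signals, which is only one of several ways a gate can be used.

```python
import math


def pairwise_ranking_loss(pos_score, neg_score):
    """Logistic pairwise loss: -log(sigmoid(pos_score - neg_score)).

    Assumed form (BPR-style); it depends only on the score difference,
    not on the absolute scores.
    """
    diff = pos_score - neg_score
    # Numerically stable softplus(-diff), which equals -log(sigmoid(diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def entropy(probs):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def gated_mix(gate_logit, signal_a, signal_b):
    """Blend two signals with a sigmoid gate g in (0, 1): g*a + (1-g)*b."""
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return g * signal_a + (1.0 - g) * signal_b


if __name__ == "__main__":
    # The loss shrinks as the positive item's margin over the negative grows.
    print(pairwise_ranking_loss(2.0, 0.0) < pairwise_ranking_loss(0.0, 0.0))
    # Three of the four positive/negative pairs are ordered correctly.
    print(auc([0.9, 0.8], [0.1, 0.85]))
    # A uniform distribution is more uncertain than a peaked one.
    print(entropy([0.5, 0.5]) > entropy([0.9, 0.1]))
    # A gate logit of 0 gives g = 0.5, an equal blend of both signals.
    print(gated_mix(0.0, 1.0, 3.0))
```

In this sketch the AUC is computed by brute force over all pairs, which matches the probabilistic definition directly but scales quadratically; a rank-based computation would be preferable on large evaluation sets.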