Instruction Tuning: Fine-tuning a pre-trained language model on a collection of tasks formatted as natural language instructions, to improve its ability to follow instructions it has not seen before.
CoT: Chain-of-Thought—a prompting strategy that encourages the model to generate intermediate reasoning steps before the final answer.
Teacher-LLM: A stronger LLM (here, GPT-3.5) used to generate synthetic training data or annotations for a smaller student model.
HR@K: Hit Ratio at K—the proportion of test cases where the target item is present in the top-K recommendations.
NDCG@K: Normalized Discounted Cumulative Gain at K—a ranking metric that accounts for the position of relevant items in the top-K list.
Implicit Preference: User preferences inferred from behavioral data (clicks, purchases) rather than stated explicitly.
Explicit Preference: User preferences stated directly in text (e.g., 'I like horror games').
SASRec: Self-Attentive Sequential Recommendation, a Transformer-based baseline model that applies self-attention over a user's interaction sequence to capture sequential patterns in behavior.
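
The HR@K and NDCG@K definitions above can be made concrete with a small sketch. It assumes each test case has exactly one target item (a common setup in sequential recommendation, but an assumption here, not something the glossary states) and that each recommendation list is already sorted from best to worst. The function names `hit_ratio_at_k` and `ndcg_at_k` are illustrative, not taken from the source.

```python
import math
from typing import Hashable, Sequence


def hit_ratio_at_k(ranked_lists: Sequence[Sequence[Hashable]],
                   targets: Sequence[Hashable], k: int) -> float:
    """Fraction of test cases whose target appears in the top-k list."""
    if not targets:
        raise ValueError("targets must not be empty")
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)


def ndcg_at_k(ranked_lists: Sequence[Sequence[Hashable]],
              targets: Sequence[Hashable], k: int) -> float:
    """Mean NDCG@k, assuming a single relevant item per test case.

    With one relevant item the ideal DCG is 1, so NDCG reduces to
    1 / log2(rank + 1) for a 1-based rank, or 0 if the target is
    outside the top k.
    """
    if not targets:
        raise ValueError("targets must not be empty")
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = list(ranked[:k])
        if target in top_k:
            position = top_k.index(target)  # 0-based position in the list
            total += 1.0 / math.log2(position + 2)
    return total / len(targets)


# Toy data: three test cases, each with one held-out target item.
ranked_lists = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
targets = ["a", "z", "m"]

hr_at_3 = hit_ratio_at_k(ranked_lists, targets, k=3)
ndcg_at_3 = ndcg_at_k(ranked_lists, targets, k=3)
```

In the toy data, the first target is ranked first, the second is ranked third, and the third is missing, so HR@3 counts two hits out of three while NDCG@3 gives the third-place hit less credit than the first-place one.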