LLM-as-feature-extractor: Using a Large Language Model to encode item text into static vector representations (embeddings) rather than generating text
CPT: Continued Pre-training, i.e., further training an LLM on unlabeled domain-specific text so that it adapts to the target domain's data distribution
SFT: Supervised Fine-tuning—training an LLM on labeled task data (e.g., QA pairs) to inject task-specific knowledge
SCFT: Supervised Contrastive Fine-tuning, i.e., fine-tuning with a contrastive loss that pulls positive pairs together and pushes negative pairs apart to improve representation quality
MoE: Mixture-of-Experts—a neural network architecture that uses a gating mechanism to select a subset of 'expert' sub-networks for each input
PCA: Principal Component Analysis, a statistical technique that reduces the dimensionality of data by projecting it onto the directions of greatest variance
PQ: Product Quantization, a method that compresses high-dimensional vectors by splitting each vector into lower-dimensional sub-vectors and quantizing each sub-vector against its own codebook
SASRec: Self-Attentive Sequential Recommendation—a Transformer-based model for sequential recommendation
NDCG: Normalized Discounted Cumulative Gain—a ranking metric that accounts for the position of relevant items in the recommendation list
HR: Hit Ratio—the proportion of test cases where the target item appears in the top-K recommendations
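
As an illustration of the two ranking metrics defined above, the following is a minimal sketch of HR@K and NDCG@K. It assumes the common sequential-recommendation setup in which each test case has exactly one relevant (target) item, so the ideal DCG equals 1; the function names and the toy data are hypothetical and not taken from the source.

```python
import math


def hit_ratio_at_k(ranked_lists, targets, k):
    """Fraction of test cases whose target item appears in the top-k list."""
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k]
    )
    return hits / len(targets)


def ndcg_at_k(ranked_lists, targets, k):
    """Mean NDCG@k, assuming one relevant item per test case (ideal DCG = 1)."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            position = top_k.index(target)  # 0-based position in the list
            # Gain of 1 discounted by log2(position + 2); the top slot scores 1.0.
            total += 1.0 / math.log2(position + 2)
    return total / len(targets)


# Hypothetical toy data: three test cases, each with a ranked list of item IDs.
ranked_lists = [[3, 7, 1], [5, 2, 9], [4, 8, 6]]
targets = [7, 5, 0]  # the third target is absent from its list

hr = hit_ratio_at_k(ranked_lists, targets, k=3)
ndcg = ndcg_at_k(ranked_lists, targets, k=3)
print(f"HR@3 = {hr:.4f}, NDCG@3 = {ndcg:.4f}")
```

HR counts a hit equally wherever it falls within the top K, whereas NDCG gives less credit the lower the target item is ranked, so the two metrics diverge whenever hits sit below the first position.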