Dense retrieval: A search method using vector embeddings to find relevant items based on semantic meaning rather than exact keyword matches
BM25: A classic lexical (sparse) retrieval algorithm that ranks documents by how often query terms occur in them, weighted by how rare each term is across the collection and normalized for document length
QLoRA: Quantized Low-Rank Adaptation, a technique for fine-tuning large models efficiently by keeping the base model's weights frozen in a quantized low-precision format (typically 4-bit) and training only small low-rank adapters
Recall@k: The proportion of all relevant items that appear in the top-k recommendations
NDCG@k: Normalized Discounted Cumulative Gain, a ranking metric that gives more credit for placing relevant items higher in the top-k list and is normalized by the score of an ideal ranking
Cold-start users: Users with very little interaction history, which makes it difficult for systems to learn their preferences
ID-based methods: Recommender systems that learn embeddings for specific item IDs, often failing to generalize to new items without retraining
Semantic search: Retrieval based on meaning and context (using embeddings) rather than just keyword matching
Beam search decoding: A text generation strategy that explores multiple likely output sequences (beams) simultaneously to find the best overall sequence
e5-small-v2: A specific dense embedding model used to convert text into vector representations for retrieval
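
To make the Recall@k and NDCG@k entries concrete, here is a minimal Python sketch of both metrics. It assumes binary relevance (an item is either relevant or not), which is a common setup but not necessarily the exact variant used in any given evaluation; the function names and the toy data are illustrative only.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top-k of `ranked`."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    # A hit at 0-based position `pos` contributes 1 / log2(pos + 2),
    # so hits near the top of the list count more.
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    # The ideal ranking places every relevant item (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example (hypothetical data): five recommendations, three relevant items,
# one of which ("z") was never recommended.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"b", "e", "z"}

print(recall_at_k(ranked, relevant, 3))
print(ndcg_at_k(ranked, relevant, 3))
```

In this toy case only "b" appears in the top three, so Recall@3 is 1/3. Moving "b" from second to first position would leave Recall@3 unchanged but raise NDCG@3, which is the position sensitivity the NDCG entry describes.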