RAG: Retrieval-Augmented Generation—AI systems that answer questions by first searching for relevant documents and then generating a response conditioned on the retrieved text
LaMP: Language Model Personalization benchmark—a dataset containing various user-centric NLP tasks like citation prediction and news categorization
BM25: Best Matching 25—a probabilistic ranking function that scores documents against a query using term frequency, inverse document frequency, and document-length normalization
cold-start problem: The difficulty of providing personalized recommendations or results for new users who lack sufficient history
ROUGE: Recall-Oriented Understudy for Gisting Evaluation—a set of metrics used to evaluate automatic summarization and machine translation
MAE: Mean Absolute Error—the average of the absolute differences between predicted and true values; lower is better
RMSE: Root Mean Square Error—the square root of the average squared difference between predicted and true values; lower is better, and large errors are penalized more heavily than under MAE
instruction-tuned: Language models fine-tuned on datasets of instructions to better follow user commands
offline inference: Generating outputs (like summaries) beforehand and storing them, rather than generating them during the user interaction
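
As a small illustration of the MAE and RMSE entries above, the following Python sketch computes both metrics on a made-up set of values. The function names `mae` and `rmse` and the sample numbers are assumptions chosen for illustration; they are not taken from any particular library or from reported results.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |prediction - target|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Hypothetical targets and predictions, for illustration only.
targets = [3, 1, 4, 5]
predictions = [3, 2, 2, 5]

print(f"MAE:  {mae(targets, predictions):.3f}")
print(f"RMSE: {rmse(targets, predictions):.3f}")
```

On these sample values the per-item errors are 0, 1, -2, and 0, so MAE is 0.75 while RMSE is about 1.118. The single larger error raises RMSE more than it raises MAE.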