User Profile: A collection of a user's historical data, specifically past inputs and personalized outputs they produced or approved
IPA: In-Prompt Augmentation—a method where retrieved user history items are directly prepended to the input text within the LLM's context window
FiD: Fusion-in-Decoder—an architecture where the encoder processes multiple retrieved passages independently, and the decoder aggregates their representations to generate the output
LaMP: Language Model Personalization—the name of the benchmark introduced in this paper
BM25: A ranking function used in information retrieval that scores a document's relevance to a search query from term frequency, inverse document frequency, and document-length normalization
Contriever: A dense retrieval model that encodes queries and documents into vector embeddings to find semantically similar items
ROUGE: Recall-Oriented Understudy for Gisting Evaluation, a set of metrics for evaluating automatic summarization and machine translation by measuring the overlap (n-grams and longest common subsequences) between system output and reference texts
MAE: Mean Absolute Error, the average absolute difference between predicted and true values; lower is better
RMSE: Root Mean Square Error, the square root of the average squared difference between predicted and true values; it penalizes large errors more heavily than MAE, and lower is better
Zero-shot: Evaluating a model on a task without providing any specific training examples for that task in the prompt or through fine-tuning
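
The MAE and RMSE entries above can be made concrete with a minimal sketch. The function names and the example ratings below are hypothetical, chosen only for illustration; they are not taken from the benchmark's evaluation code.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |prediction - truth|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: sqrt of the average squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Hypothetical 1-5 ratings, invented for this example.
truth = [5, 3, 4, 1]
preds = [4, 3, 2, 1]

print(mae(truth, preds))   # absolute errors are 1, 0, 2, 0
print(rmse(truth, preds))  # squared errors are 1, 0, 4, 0
```

In this example the single two-point miss accounts for most of the RMSE, which is why RMSE is never smaller than MAE on the same predictions.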