REPLUG: Retrieve and Plug—the proposed framework for augmenting black-box LMs with retrieved documents
REPLUG LSR: REPLUG with LM-Supervised Retrieval, the training scheme in which the black-box LM's own outputs supervise the retriever, so that it learns to prefer documents that lower the LM's perplexity
Contriever: A specific dense information retrieval model based on contrastive learning, used as the base retriever
Perplexity: A measurement of how well a probability model predicts a sample; lower is better
KL divergence: Kullback-Leibler divergence, an asymmetric measure of how much one probability distribution differs from a second, reference probability distribution
FAISS: Facebook AI Similarity Search—a library for efficient similarity search and clustering of dense vectors
Dual Encoder: A retrieval architecture that embeds queries and documents with two encoders (which may share weights) into the same vector space, where relevance is scored by vector similarity
MMLU: Massive Multitask Language Understanding, a benchmark covering 57 tasks across STEM, the humanities, the social sciences, and other areas
Zero-shot/Few-shot: Evaluating a model with no (zero) or very few (few) examples in the prompt
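
The Perplexity and KL divergence entries above are the two quantities in this list that have a direct formula, so a small worked sketch may help. The snippet below is only an illustration using the standard textbook definitions over small discrete distributions. The function names `perplexity` and `kl_divergence` and the toy inputs are assumptions made for this example, not code from REPLUG.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the probability the model
    assigned to each observed token (hypothetical toy input)."""
    # Average negative log-likelihood per token, then exponentiate.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support.
    Terms where p is zero contribute nothing, by the usual convention.
    Assumes q is nonzero wherever p is nonzero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every observed token is
    # as uncertain as a uniform choice among four options.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))

    # KL divergence is zero for identical distributions and is not
    # symmetric in its arguments.
    p, q = [0.5, 0.5], [0.25, 0.75]
    print(kl_divergence(p, p), kl_divergence(p, q), kl_divergence(q, p))
```

Lower perplexity means the model assigned higher probability to the observed tokens. Swapping the arguments of `kl_divergence` generally changes the result, which is why the reference distribution has to be stated.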