Contriever: A dense information retrieval model based on continuous embeddings, pre-trained using contrastive learning
Fusion-in-Decoder (FiD): A sequence-to-sequence architecture where the encoder processes retrieved documents independently, and the decoder attends to their concatenated representations
Perplexity Distillation: A training objective where the retriever minimizes the KL-divergence between its document distribution and the language model's posterior distribution over documents
ADist: Attention Distillation—using the language model's cross-attention scores as supervision to train the retriever
EMDR2: End-to-end training of Multi-Document Reader and Retriever—an algorithm treating retrieved documents as latent variables to maximize the likelihood of the answer
MMLU: Massive Multitask Language Understanding. A benchmark covering 57 subjects spanning areas such as STEM, the humanities, and the social sciences
KILT: Knowledge-Intensive Language Tasks—a benchmark suite requiring external knowledge (Wikipedia) to solve tasks like QA and fact checking
FEVER: Fact Extraction and VERification—a dataset for fact-checking claims against evidence
query-side fine-tuning: Updating only the query encoder parameters while keeping the document encoder fixed to avoid costly index re-computation
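
As an illustration of the Perplexity Distillation entry above, the following is a minimal sketch of the loss for one query, not the original implementation. It assumes the language model's posterior over the retrieved documents is a softmax of its answer log-likelihoods (a uniform prior over documents), that the retriever distribution is a softmax of its similarity scores, and that the KL-divergence is taken from the language-model target to the retriever distribution. The function names and the temperature parameter are hypothetical.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def perplexity_distillation_loss(retriever_scores, lm_log_likelihoods,
                                 temperature=1.0):
    """KL(target || retriever) over the retrieved documents of one query.

    retriever_scores[k]   : query-document similarity for document k
    lm_log_likelihoods[k] : log p_LM(answer | query, document k)
    Both the KL direction and the temperature are assumptions of this sketch.
    """
    # Posterior over documents under a uniform prior (assumed).
    target = softmax(lm_log_likelihoods)
    # Retriever's distribution over the same documents.
    predicted = softmax(retriever_scores, temperature)
    return sum(t * (math.log(t) - math.log(p))
               for t, p in zip(target, predicted) if t > 0.0)

# Illustrative values only: the language model finds document 0 most helpful,
# while the retriever currently scores all three documents equally.
loss = perplexity_distillation_loss([0.0, 0.0, 0.0], [-1.0, -3.0, -6.0])
print(f"perplexity distillation loss: {loss:.4f}")
```

In training, the target would be treated as a constant and only the retriever scores would receive gradients; the loss reaches zero when the retriever's distribution matches the language model's posterior.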