RAG: Retrieval-Augmented Generation. AI systems that answer questions by retrieving relevant documents and conditioning a language model's generated answer on them
single-hop QA: Questions where the answer can be found in a single document or reasoning step
multi-hop QA: Questions requiring reasoning across multiple documents (e.g., bridging entity A to B to C)
Iter-RetGen: Iterative Retrieval-Generation. A method that alternates retrieval and generation over several rounds, using each round's generated output to guide the next retrieval
silver labels: Training labels generated automatically (e.g., by checking which model answers correctly) rather than by humans
inductive bias: Assumptions built into a learning algorithm or data (e.g., assuming all questions in a 'multi-hop' dataset are complex)
FLAN-T5: A family of instruction-tuned language models based on the T5 architecture
Retriever: A module (like DPR or Contriever) that finds relevant documents from a large corpus
Reader: The LLM component that processes the query and retrieved documents to generate an answer
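
To make the Retriever and Reader entries concrete, here is a minimal sketch of a single retrieve-then-read step. It is illustrative only: the word-overlap scoring stands in for a learned retriever such as DPR or Contriever, and the names `CORPUS`, `retrieve`, and `build_reader_prompt` are hypothetical, not taken from any library or from the work this glossary accompanies. The actual LLM call is left out; the sketch stops at the prompt the reader would receive.

```python
from collections import Counter

# Toy corpus (assumption): a real system would index a large collection.
CORPUS = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "The Seine flows through Paris.",
]


def tokenize(text):
    """Lowercase and strip basic punctuation from whitespace-split words."""
    return [w.strip(".,?!").lower() for w in text.split()]


def retrieve(query, corpus, k=1):
    """Toy retriever: rank documents by word overlap with the query.

    This is a stand-in for a dense retriever, not how DPR or Contriever score.
    """
    q = Counter(tokenize(query))
    scored = []
    for doc in corpus:
        d = Counter(tokenize(doc))
        overlap = sum((q & d).values())  # size of the multiset intersection
        scored.append((overlap, doc))
    # sort is stable, so ties keep their original corpus order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_reader_prompt(query, docs):
    """Format the reader's input: retrieved documents, then the question."""
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


question = "What is the capital of France?"
top_docs = retrieve(question, CORPUS, k=1)
prompt = build_reader_prompt(question, top_docs)
```

An iterative method in the style of Iter-RetGen would, roughly, wrap these two steps in a loop and append the previous round's generated answer to the query before calling the retriever again.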