RAG: Retrieval-Augmented Generation—AI systems that answer questions by searching for documents before generating responses
FLARE: Forward-Looking Active REtrieval augmented generation—the proposed method that uses hypothetical future sentences to guide retrieval
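The FLARE loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generate_sentence` and `retrieve` are hypothetical stand-ins for the language model and retriever, and the threshold value is assumed.

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed value; in practice this is tuned

def generate_sentence(context):
    """Hypothetical LM call: returns (next sentence, min token probability).
    Toy behavior: low confidence until retrieved evidence is in context."""
    confidence = 0.95 if "[doc]" in context else 0.5
    return "Next sentence.", confidence

def retrieve(query):
    """Hypothetical retriever: returns a document string for the query."""
    return "[doc] evidence related to: " + query

def flare_generate(question, max_sentences=3):
    answer = []
    context = question
    for _ in range(max_sentences):
        # 1. Tentatively generate the next sentence (the "lookahead").
        sentence, confidence = generate_sentence(context)
        # 2. If the model is uncertain, use the tentative sentence itself
        #    as the retrieval query, then regenerate with the evidence.
        if confidence < CONFIDENCE_THRESHOLD:
            context = retrieve(sentence) + "\n" + context
            sentence, confidence = generate_sentence(context)
        answer.append(sentence)
        context += " " + sentence
    return " ".join(answer)
```

The key difference from standard RAG is step 2: the query is the model's own hypothetical next sentence, not the original question.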
hallucination: When a language model confidently generates factually incorrect or nonsensical information
single-time retrieval: Standard RAG setup where documents are retrieved once based on the user input before generation starts
passive retrieval: Retrieving information at fixed intervals (e.g., every k tokens) regardless of whether the model needs it
active retrieval: The system dynamically decides when to retrieve based on specific criteria (e.g., low model confidence)
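The contrast between passive and active triggering can be shown in two small predicate functions (a sketch; the interval k and the threshold are assumed example values):

```python
def should_retrieve_passive(step, k=16):
    """Passive: retrieve at fixed intervals, every k generated tokens,
    regardless of whether the model actually needs new information."""
    return step > 0 and step % k == 0

def should_retrieve_active(token_probs, threshold=0.8):
    """Active: retrieve only when the model is uncertain, i.e. when
    some generated token falls below a confidence threshold."""
    return min(token_probs) < threshold
```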
chain-of-thought: A prompting technique where the model generates intermediate reasoning steps before the final answer
BM25: Best Matching 25—a standard ranking function used by search engines to estimate the relevance of documents to a query
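The BM25 scoring function itself is short enough to write out. A minimal sketch over pre-tokenized documents, using the common k1 and b defaults and the standard smoothed IDF:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document (a list of tokens) against a query (a list of tokens)."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each query term.
    df = {t: sum(1 for d in docs if t in d) for t in set(query)}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query:
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # Term-frequency saturation, with length normalization via b.
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Documents sharing more (rarer) query terms score higher; documents with no query terms score zero.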
EM: Exact Match—a metric checking if the generated answer text matches the ground truth exactly
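In QA evaluation, "exactly" usually means after light normalization (lowercasing, stripping punctuation and articles). A minimal sketch:

```python
import re
import string

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, ground_truth):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(ground_truth))
```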
ROUGE: Recall-Oriented Understudy for Gisting Evaluation—a set of metrics used to evaluate automatic summarization by comparing to human summaries
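The simplest member of the family, ROUGE-1 recall, measures what fraction of the reference's unigrams appear in the candidate. A minimal sketch over whitespace tokenization:

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams (with multiplicity) covered by the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())
```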
UniEval: A metric for evaluating text generation quality, focusing here on factual consistency
zero-shot: The model performs a task without any task-specific examples in the prompt
few-shot: The model is given a small number of examples (shots) in the prompt to understand the task