RAG: Retrieval-Augmented Generation—an architecture where a generator conditions on documents retrieved by a neural retriever
FiD: Fusion-in-Decoder—a method where the encoder processes documents independently, and the decoder attends to their concatenated representations
DPR: Dense Passage Retriever—a bi-encoder system using dot-product similarity between query and document vectors for retrieval
Poly-encoder: An architecture allowing late interaction between context and candidates using learned attention codes, balancing speed and expressiveness
Knowledge F1: A metric measuring unigram overlap between the model's generation and the ground-truth knowledge snippet (not just the reference response)
Rare F1: F1 score calculated only on words in the lower half of the dataset's cumulative frequency distribution to penalize safe, common responses
FAISS: Facebook AI Similarity Search—a library for efficient similarity search and clustering of dense vectors
RAG-Turn: A proposed variant that retrieves documents separately for each dialogue turn and then marginalizes over them jointly, rather than retrieving once for the concatenated context
hallucination: The generation of text that is grammatically plausible but factually incorrect or nonsensical relative to the source or to reality
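
The Knowledge F1 and Rare F1 entries above both reduce to a unigram-overlap F1 computed against different targets or vocabularies. The following is a minimal sketch, not the reference implementation: it assumes lowercased whitespace tokenization (real evaluation code usually also normalizes punctuation), and the function names `unigram_f1`, `rare_vocabulary`, and `rare_f1` are hypothetical. The cutoff rule in `rare_vocabulary` is one plausible reading of "lower half of the cumulative frequency distribution".

```python
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercased whitespace tokenization.
    return text.lower().split()


def unigram_f1(generated_tokens, target_tokens):
    """Harmonic mean of unigram precision and recall (multiset overlap)."""
    if not generated_tokens or not target_tokens:
        return 0.0
    overlap = sum((Counter(generated_tokens) & Counter(target_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(generated_tokens)
    recall = overlap / len(target_tokens)
    return 2 * precision * recall / (precision + recall)


def rare_vocabulary(corpus_tokens):
    """Words lying in the lower half of the cumulative frequency mass.

    Words are visited from most to least frequent; a word counts as rare
    once the words before it already cover at least half of all tokens.
    """
    counts = Counter(corpus_tokens)
    half = sum(counts.values()) / 2
    rare, cumulative = set(), 0
    for word, count in counts.most_common():
        if cumulative >= half:
            rare.add(word)
        cumulative += count
    return rare


def rare_f1(generated_tokens, target_tokens, rare):
    """Unigram F1 restricted to words in the rare vocabulary."""
    gen = [t for t in generated_tokens if t in rare]
    tgt = [t for t in target_tokens if t in rare]
    return unigram_f1(gen, tgt)


# Knowledge F1: score the generation against the knowledge snippet,
# not against the reference response.
generation = tokenize("the cat")
knowledge = tokenize("the cat sat on")
knowledge_f1 = unigram_f1(generation, knowledge)

# Rare F1: the shared common word "the" no longer earns any credit.
rare = rare_vocabulary(tokenize("the the the the cat dog"))
common_word_score = rare_f1(tokenize("the cat"), tokenize("the dog"), rare)
```

Under this sketch, a response that matches the target only on frequent words keeps a nonzero plain F1 but scores zero on Rare F1, which is the penalty on safe, common responses that the definition describes.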