RAG: Retrieval-Augmented Generation—systems that improve LLM responses by retrieving relevant external documents
Knowledge Graph (KG): A structured representation of data as a graph where nodes are entities and edges are relationships
Scaffold: A guiding structure or constraint used to control the generation process of an LLM
Factual Scaffold: Using human-written citing sentences as the immutable basis for fact generation
Algorithmic Scaffold: Using a knowledge graph structure to programmatically dictate question type and complexity
nDCG@k: Normalized Discounted Cumulative Gain at cutoff k, a measure of ranking quality over the top k results that rewards placing relevant items nearer the top and is normalized by the score of an ideal ranking
Closed-book setting: Asking the model to answer questions using only the knowledge it acquired during training, without external retrieval
False-premise: A question based on an incorrect assumption (e.g., 'When did the US President visit Mars?')
qrels: Query relevance judgments—annotations indicating which documents are relevant to a specific query
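As an illustration of how the nDCG@k and qrels entries fit together, here is a minimal Python sketch. It assumes the linear-gain form of DCG (gain divided by log2 of rank + 1) and represents the qrels for one query as a dictionary from document ID to relevance grade; the function names and the sample IDs are hypothetical and not taken from any particular evaluation toolkit.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (linear-gain variant)."""
    # Rank r (1-based) is discounted by log2(r + 1); enumerate is 0-based, hence i + 2.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids, qrels, k):
    """nDCG@k for one query.

    ranked_doc_ids: document IDs in the order the system returned them.
    qrels: dict mapping document ID to relevance grade (unjudged documents count as 0).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    # The ideal ranking lists the judged documents from most to least relevant.
    ideal_gains = sorted(qrels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical qrels for a single query: two relevant documents.
example_qrels = {"d1": 1, "d2": 1}
perfect = ndcg_at_k(["d1", "d2", "d3"], example_qrels, k=3)
shifted = ndcg_at_k(["d3", "d1", "d2"], example_qrels, k=3)
```

In this sketch a ranking that places both relevant documents first scores 1.0, while pushing them down by one position lowers the score. Some toolkits use an exponential gain (2^rel - 1) instead, which gives different values when relevance grades exceed 1.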