Selective Retrieval: A RAG strategy where the system decides for each query whether to retrieve external documents or rely on the model's internal knowledge
Parametric Knowledge: Information stored implicitly in the weights (parameters) of a neural network during pre-training
Knowledge Verbalization: The process of explicitly generating (writing out) relevant internal knowledge as text before answering a question
GenRead: A method (Generate-then-Read) where an LLM generates context documents based on a query instead of retrieving them
DPO: Direct Preference Optimization, a method that fine-tunes a language model directly on preference pairs (a preferred and a rejected response) without training a separate reward model
Policy Datastore: A memory bank storing query representations and their preferred routing labels, consulted at test time through a kNN lookup to guide the routing decision
kNN: k-Nearest Neighbors, an algorithm that classifies a new data point by the majority class of its k closest examples in the training set
SR-RAG: Self-Routing RAG, the proposed framework unifying routing and generation
Behavior Cloning: Supervised learning where a model learns to mimic the actions of an expert (or oracle) policy
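
Illustration (kNN over a Policy Datastore): The sketch below shows how a kNN lookup over a policy datastore could produce a routing label for a new query. It is a minimal illustration, not the SR-RAG implementation. The two-dimensional vectors, the label names "retrieve" and "verbalize", the cosine-similarity metric, and the function names are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_route(query_vec, datastore, k=3):
    """Return the majority routing label among the k stored queries most similar to query_vec."""
    if not datastore:
        raise ValueError("datastore is empty")
    # Rank stored entries from most to least similar to the incoming query.
    ranked = sorted(
        datastore,
        key=lambda entry: cosine_similarity(query_vec, entry[0]),
        reverse=True,
    )
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy datastore: (query representation, preferred routing label).
POLICY_DATASTORE = [
    ([1.0, 0.0], "retrieve"),
    ([0.9, 0.1], "retrieve"),
    ([0.0, 1.0], "verbalize"),
    ([0.1, 0.9], "verbalize"),
    ([0.2, 0.8], "verbalize"),
]

if __name__ == "__main__":
    print(knn_route([0.95, 0.05], POLICY_DATASTORE, k=3))
    print(knn_route([0.05, 0.95], POLICY_DATASTORE, k=3))
```

In a real system the query representations would presumably come from the model or an encoder rather than being hand-written, and an approximate nearest-neighbor index would likely replace the full sort used here.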