Confidence Gain (CG): A metric defined as the entropy of the model's parametric prediction (query only) minus the entropy of the RAG prediction (query + context); a negative CG means the retrieved context made the model less certain, which indicates a knowledge conflict
Memory Recall (MR): A metric measuring how often the model generates an answer based on its internal parameters rather than the retrieved context
Parametric Knowledge: Information stored in the LLM's pre-trained weights
Contextual Knowledge: Information provided in the retrieved documents or prompt
Perplexity: A measurement of how uncertain a probability model is about its predictions; calculated here using the entropy of the token distribution
ConR: Recall of Context; the percentage of answers that match the retrieved information
ParR: Recall of Parameters; the percentage of answers that match the model's internal knowledge
Contrastive Decoding: A decoding strategy that manipulates logits by contrasting two distributions (e.g., strong vs. weak model, or here, context-aware vs. parameter-aware)
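
The entropy-based terms above (Confidence Gain, Perplexity) and the logit contrast behind Contrastive Decoding can be illustrated with a small sketch. It is not the source's implementation: the function names, the toy distributions, the sign convention for CG (parametric entropy minus RAG entropy), and the specific contrast formula with weight `alpha` are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def perplexity(probs):
    """Perplexity computed as the exponential of the entropy."""
    return math.exp(entropy(probs))


def confidence_gain(parametric_probs, rag_probs):
    """Assumed convention: H(query only) - H(query + context).

    Positive: the context made the model more certain.
    Negative: the context made it less certain (possible knowledge conflict).
    """
    return entropy(parametric_probs) - entropy(rag_probs)


def contrastive_logits(context_logits, parametric_logits, alpha=1.0):
    """One common contrast form (an assumption, not taken from the source):
    amplify what the context-aware pass prefers over the query-only pass.
    With alpha = 0 this reduces to the context-aware logits unchanged.
    """
    return [
        (1.0 + alpha) * c - alpha * p
        for c, p in zip(context_logits, parametric_logits)
    ]


# Toy next-token distributions over a 4-token vocabulary (hypothetical values).
parametric = [0.7, 0.1, 0.1, 0.1]          # query only
rag_agree = [0.97, 0.01, 0.01, 0.01]       # context supports the same answer
rag_conflict = [0.4, 0.4, 0.1, 0.1]        # context pushes a competing answer

cg_agree = confidence_gain(parametric, rag_agree)        # positive
cg_conflict = confidence_gain(parametric, rag_conflict)  # negative

# Contrasting logits shifts probability mass toward the context-favoured token.
adjusted = softmax(contrastive_logits([1.0, 2.0, 0.0, 0.0], [2.0, 1.0, 0.0, 0.0]))
```

In this toy setup the agreeing context sharpens the distribution, so CG is positive, while the conflicting context splits the mass between two answers and CG turns negative. The contrast step then raises the probability of the token that the context-aware pass favours relative to the query-only pass.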