Parametric knowledge: Facts and information stored within the model's weights during pre-training, as opposed to external context
Instruction tuning: Fine-tuning a pre-trained model on datasets formatted as instructions to improve its ability to follow user commands
AP score: Average Precision, a metric that evaluates the quality of uncertainty estimation by ranking predictions by confidence; a high AP means correct answers tend to receive higher confidence than incorrect ones
Refusal-aware data: Training data modified to include explicit expressions of uncertainty (e.g., 'I am unsure') when the model's parametric knowledge does not match the ground truth
Uncertainty calibration: The alignment between a model's predicted confidence and its actual accuracy
Meta-skill: A generalized ability (here, refusal) that applies across different tasks and domains, not just the specific samples seen during training
Padding method: A data construction strategy where the original label is kept, but a certainty/uncertainty phrase is appended to it
Replacement method: A data construction strategy where the label for unknown questions is completely replaced by a refusal phrase (e.g., 'I don't know')
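
Example: the padding method, the replacement method, and the AP score can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original implementation: the phrases SURE, UNSURE, and REFUSAL, the exact-match test in is_known, and all function names are hypothetical choices made for this sketch.

```python
from typing import List, Tuple

# Hypothetical phrases; the exact wording in any given setup may differ.
SURE = "I am sure."
UNSURE = "I am unsure."
REFUSAL = "I don't know."


def is_known(prediction: str, label: str) -> bool:
    """Assumption: a question counts as 'known' when the model's answer
    matches the ground-truth label (case-insensitive exact match)."""
    return prediction.strip().lower() == label.strip().lower()


def padding_target(prediction: str, label: str) -> str:
    """Padding: keep the original label and append a certainty phrase."""
    suffix = SURE if is_known(prediction, label) else UNSURE
    return f"{label} {suffix}"


def replacement_target(prediction: str, label: str) -> str:
    """Replacement: keep the label if known, otherwise use a refusal phrase."""
    return label if is_known(prediction, label) else REFUSAL


def average_precision(scored: List[Tuple[float, bool]]) -> float:
    """AP over (confidence, is_correct) pairs ranked by confidence, descending.

    Precision is accumulated at the rank of each correct answer and then
    averaged over the number of correct answers.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, correct) in enumerate(ranked, start=1):
        if correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0
```

With these definitions, a wrong prediction for a question labeled "Paris" yields the training target "Paris I am unsure." under padding and "I don't know." under replacement. For AP, a ranking in which every correct answer has higher confidence than every incorrect one scores 1.0, and the score drops as incorrect answers move up the ranking.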