Spearman's rank correlation: A statistical measure of how well the relationship between two variables can be described using a monotonic function
Gaussian copula: A statistical model of the dependency structure between variables (here, refusal and error) that maps each variable onto a standard normal scale and summarizes their dependence with a correlation parameter
knowledge-aware refusal: The ability of a model to refuse to answer a question specifically because it lacks the knowledge to answer correctly, avoiding both overconfidence and over-refusal
two-pass evaluation: A method where the model is queried twice: once allowing refusals (to check behavior) and once forcing an answer (to check knowledge/correctness)
AUROC: Area Under the Receiver Operating Characteristic Curve, a metric measuring a classifier's ability to distinguish between classes (here, how well the refusal decision separates incorrect answers from correct ones)
SimpleQA: A dataset of short, fact-seeking questions used to evaluate LLM factuality
Prompting: The process of structuring the input text to guide the LLM's behavior; here it is used to adjust refusal rates
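
To make the Spearman and AUROC entries concrete, here is a minimal sketch in plain Python of how both could be computed from two-pass evaluation output. The function names (`average_ranks`, `spearman`, `auroc`) and the toy `refused` / `wrong` lists are hypothetical and made up for illustration; they are not taken from any paper or dataset. AUROC is computed here through its pairwise-ranking interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half), which is one standard way to obtain it.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def auroc(scores, labels):
    """P(score of a random positive > score of a random negative), ties = 0.5."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy two-pass data (illustrative only, one entry per question):
# pass 1 allows refusals -> refused[i] is 1 if the model declined to answer;
# pass 2 forces an answer -> wrong[i] is 1 if the forced answer was incorrect.
refused = [1, 1, 0, 0, 1, 0, 0, 0]
wrong = [1, 1, 0, 0, 0, 1, 0, 0]

# Refusal acts as the "score" and incorrectness as the positive class.
print(auroc(refused, wrong))  # 11/15, about 0.733

# A monotonic but nonlinear relationship still has rank correlation 1.
print(spearman([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

In this sketch an AUROC of 0.5 would mean refusals carry no information about which forced answers are wrong, while values near 1 would mean the model refuses mostly on questions it would get wrong.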