LLM-as-a-Judge: Using a strong LLM to evaluate the correctness of another model's output, acting as a proxy for human labeling
Predictive Entropy (PE): A measure of uncertainty based on the flatness of the output probability distribution; high entropy means the model is unsure which token to pick
Aleatoric Uncertainty: Uncertainty arising from inherent ambiguity or noise in the data/context
Epistemic Uncertainty: Uncertainty arising from a lack of knowledge in the model parameters
AUROC: Area Under the Receiver Operating Characteristic curve, a metric measuring how well a classifier's scores separate the positive (hallucinated) and negative (factual) classes; 0.5 corresponds to chance and 1.0 to perfect separation
Token Negative Log-Likelihood (T-NLL): The negative logarithm of the probability assigned to the generated token; an inverse measure of model confidence (lower values mean the model was more confident in that token)
Hidden States: The internal vector representations of tokens at each transformer layer (contextual embeddings), capturing semantic meaning
SQuAD: Stanford Question Answering Dataset—a reading comprehension benchmark
CNN: Convolutional Neural Network—used here to process the sequence of embeddings and capture local dependencies
MLP: Multilayer Perceptron—a simple feedforward neural network
OOD: Out-of-Distribution, referring to data types or domains the model was not trained on; used here when evaluating the model on such data
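
The three quantitative terms above (Predictive Entropy, T-NLL, and AUROC) can be illustrated with a minimal sketch. It assumes the next-token distribution is available as a plain list of probabilities; the function names and the toy numbers are hypothetical and chosen only for illustration, not taken from the source.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def token_nll(probs, token_id):
    """Negative log-probability of the token that was actually generated."""
    return -math.log(probs[token_id])

def auroc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. labels: 1 = hallucinated, 0 = factual.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy distributions over a 4-token vocabulary (illustrative values only).
peaked = [0.97, 0.01, 0.01, 0.01]   # model strongly prefers token 0
flat = [0.25, 0.25, 0.25, 0.25]     # model is maximally unsure

# A flat distribution has higher entropy than a peaked one.
assert predictive_entropy(flat) > predictive_entropy(peaked)

# The same generated token has a higher NLL when the model was less confident.
assert token_nll(flat, 0) > token_nll(peaked, 0)

# Toy uncertainty scores used as a hallucination detector.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(auroc(scores, labels))  # 0.75: three of four pos/neg pairs are ranked correctly
```

The pairwise AUROC here is quadratic in the number of examples, which is fine for illustration; a rank-based or library implementation would be the usual choice at scale.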