Semantic Entropy (SE): An uncertainty measure that clusters multiple model generations by meaning (using NLI) and calculates entropy over these semantic clusters
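SEP: Semantic Entropy Probe, a linear probe trained on LLM hidden states to predict Semantic Entropy (in practice a binarized high-versus-low SE label) from a single forward pass, without sampling multiple generations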
Linear Probe: A simple linear classifier (e.g., logistic regression) trained on the fixed features (hidden states) of a pre-trained model
NLI: Natural Language Inference, the task of classifying whether one text entails (logically implies), contradicts, or is neutral toward another
DeBERTa: Decoding-enhanced BERT with disentangled attention—a transformer model often used for NLI tasks
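AUROC: Area Under the Receiver Operating Characteristic curve, a threshold-independent metric for binary classification equal to the probability that a randomly chosen positive example is scored above a randomly chosen negative one (0.5 is chance, 1.0 is perfect)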
TBG: Token Before Generation—the hidden state at the last token of the input query
SLT: Second Last Token, the hidden state at the final generated token of the model response, i.e. the token immediately before the end-of-sequence (EOS) token
Hallucination: Plausible-sounding but factually incorrect or arbitrary generation by an LLM
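As a rough illustration of the Semantic Entropy definition above, the sketch below computes entropy over semantic clusters once the clustering has already been done. It assumes each sampled generation has been assigned a cluster id by a separate NLI step (not shown) and weights every sample equally; the function name `semantic_entropy` and the integer cluster ids are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def semantic_entropy(cluster_ids):
    """Entropy (in nats) of the empirical distribution over semantic clusters.

    cluster_ids: one cluster label per sampled generation. The labels are
    assumed to come from an NLI-based clustering step that is not shown here.
    """
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    # Sum of -p * log(p) over clusters, with p the share of samples per cluster.
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


# All samples share one meaning: no semantic uncertainty.
consistent = semantic_entropy([0, 0, 0, 0])
# Samples split evenly across three meanings: high semantic uncertainty.
scattered = semantic_entropy([0, 0, 1, 1, 2, 2])
```

Low values indicate that the sampled answers agree in meaning; high values indicate that they scatter across meanings, which is the signal associated with hallucination in this glossary.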