Factuality Hallucination: Generated content that contradicts verifiable real-world facts (e.g., wrong dates, fabricated events)
Faithfulness Hallucination: Generated content that diverges from user instructions, provided context, or internal logical consistency, regardless of real-world factual correctness
Intrinsic Hallucination: Generated output that directly contradicts the provided source content (traditional NLG definition)
Extrinsic Hallucination: Generated output that cannot be verified from the source content (traditional NLG definition)
SFT: Supervised Fine-Tuning; training a pretrained model on labeled (instruction, response) pairs so that it learns to follow instructions
RLHF: Reinforcement Learning from Human Feedback; aligning the model with human preferences by training a reward model on human preference judgments and then optimizing the model against that reward with reinforcement learning
RAG: Retrieval-Augmented Generation; retrieving relevant external documents at generation time and conditioning the response on them so that it is grounded in the retrieved text
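Softmax Bottleneck: A theoretical limitation of the output layer of language models: because the softmax is applied to logits whose rank is bounded by the hidden dimension, the model cannot represent arbitrarily complex next-token probability distributions across contexts, which may contribute to hallucinations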
Entity-Error Hallucination: A subtype of factual contradiction where the model generates an erroneous entity (e.g., naming the wrong inventor)
Relation-Error Hallucination: A subtype of factual contradiction where the model asserts an incorrect relationship between entities
Unverifiability Hallucination: A subtype of factual fabrication where the model asserts something that does not exist or that cannot be verified against available sources
Overclaim Hallucination: A subtype of factual fabrication where the model presents subjective or controversial opinions as universally valid facts
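
As a worked note on the Softmax Bottleneck entry above, the limitation can be stated as a rank argument. The symbols used here (h_c, w_x, d, N, M, H, W, A) are notation assumed for illustration and do not come from the glossary itself; this is a sketch of the standard argument, not a full proof.

```latex
% Assumed notation: h_c \in \mathbb{R}^d is the final hidden state for context c,
% and w_x \in \mathbb{R}^d is the output embedding of token x.
\[
P_\theta(x \mid c) = \frac{\exp(h_c^\top w_x)}{\sum_{x'} \exp(h_c^\top w_{x'})}
\]
% Stack N contexts and M vocabulary tokens:
% H \in \mathbb{R}^{N \times d}, W \in \mathbb{R}^{M \times d}.
\[
A_{c,x} = \log P_\theta(x \mid c)
        = (H W^\top)_{c,x} - \log \sum_{x'} \exp\big((H W^\top)_{c,x'}\big)
\]
% The log-normalizer shifts each row by a constant, i.e. it is a rank-1 term.
\[
\operatorname{rank}(H W^\top) \le d
\quad\Longrightarrow\quad
\operatorname{rank}(A) \le d + 1
\]
```

If the matrix of true log-probabilities has rank above d + 1, which is plausible when d is much smaller than the vocabulary size M, no setting of the parameters reproduces it exactly. This is the sense in which the final layer "restricts" the distributions the model can express.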