HalluClean: The proposed zero-shot framework that uses structured reasoning prompts to detect and correct hallucinations
Zero-shot: The ability of a model to perform a task without task-specific training examples or in-prompt demonstrations for that task
Chain-of-Thought (CoT): A prompting technique that encourages the model to generate intermediate reasoning steps before producing a final answer
Task-routing prompts: Minimal, high-level descriptions used to orient the model towards the specific requirements of a task (e.g., 'Summary', 'Dialogue')
HaluEval: A benchmark dataset for evaluating hallucination detection across QA, dialogue, and summarization tasks
BERTScore: A metric that scores the similarity between a candidate text and a reference text by matching their tokens' contextual embeddings via cosine similarity, often used to measure semantic fidelity in text generation
Plan-and-Solve: A paradigm where the model first generates a plan to solve a problem and then executes it, improving reliability over direct answering
Self-contradiction: A specific type of hallucination where the model generates logically inconsistent statements within the same response
Math Word Problems (MWP): Tasks requiring the model to solve mathematical problems presented in natural language text
Retrieval-Augmented Generation (RAG): Methods that enhance model outputs by retrieving relevant documents from an external knowledge source and conditioning generation on them
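
As an illustration of the BERTScore entry above, the greedy token matching behind the metric can be sketched in a few lines. This is a minimal sketch, not the reference implementation: the function names `cosine` and `bertscore_style` are hypothetical, and the two-dimensional toy vectors stand in for the contextual embeddings a real encoder would produce. It also omits the optional IDF weighting and baseline rescaling, and assumes all vectors are non-zero.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def bertscore_style(candidate_vecs, reference_vecs):
    """Greedy-matching precision, recall, and F1 over token embeddings."""
    # Precision: match each candidate token to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: match each reference token to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy embeddings (hypothetical): one vector per token.
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

p, r, f = bertscore_style(candidate, reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case every candidate token has an exact match in the reference, so precision is 1.0, while the third reference token is only partially covered, which pulls recall (and therefore F1) below 1.0.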