Dialogue CoT: Dialogue Chain-of-Thought—a reasoning format decomposing commonsense inference into a sequence of question-answer pairs derived from conversation history
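As a concrete illustration, the question-answer-pair format can be represented with a small data structure. This is a hypothetical sketch; the class and field names are illustrative, not from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class QAStep:
    question: str  # commonsense question raised by the dialogue
    answer: str    # inferred answer (one reasoning hop)

@dataclass
class DialogueCoT:
    history: list[str]                                 # conversation turns so far
    steps: list[QAStep] = field(default_factory=list)  # multi-hop rationale

    def rationale(self) -> str:
        # Flatten the QA chain into a single rationale string.
        return " ".join(f"Q: {s.question} A: {s.answer}" for s in self.steps)

cot = DialogueCoT(
    history=["A: I failed my driving test again.", "B: Oh no."],
    steps=[QAStep("How does A feel?", "A feels discouraged."),
           QAStep("What does A need?", "A needs encouragement.")],
)
print(cot.rationale())
```

Each `QAStep` is one inference hop; chaining them yields the multi-hop rationale that conditions the response.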
DOCTOR: DialOgue Chain-of-ThOught Reasoner—the specific model (based on OPT-1.3B) trained in this paper to generate rationales
DONUT: DialOgue chaiN-of-thoUght dataseT—the large-scale dataset of 10K dialogues with filtered, high-quality CoT annotations constructed in this paper
Alignment Filters: Mechanisms used to select high-quality rationales; specifically, a consistency check (the rationale must not hallucinate content unsupported by the dialogue) and a helpfulness check (the rationale must raise the probability of the ground-truth response)
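The helpfulness criterion can be sketched as a simple comparison of response likelihoods with and without the rationale. The `log_prob` scorer below is a hypothetical stand-in for a dialogue model (the paper uses Cosmo-3B for this role); the toy scorer is purely illustrative:

```python
from typing import Callable

def helpfulness_filter(rationale: str, dialogue: str, response: str,
                       log_prob: Callable[[str, str], float]) -> bool:
    """Keep a rationale only if conditioning on it raises the base agent's
    log-probability of the ground-truth response.
    `log_prob(context, response)` is a hypothetical scorer interface."""
    with_rationale = log_prob(dialogue + "\n" + rationale, response)
    without_rationale = log_prob(dialogue, response)
    return with_rationale > without_rationale

# Toy scorer for illustration only: rewards contexts mentioning "umbrella".
def toy_log_prob(context: str, response: str) -> float:
    return -1.0 if "umbrella" in context else -2.0

print(helpfulness_filter("A likely needs an umbrella.",
                         "A: It's pouring outside.",
                         "B: Take my umbrella.",
                         toy_log_prob))  # prints: True
```

In practice the scorer would be the base agent's conditional log-likelihood; rationales failing either filter are discarded from the dataset.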
COMET: Commonsense Transformers—a standard baseline model that generates single-hop commonsense inferences
ATOMIC: A large-scale atlas of everyday commonsense if-then knowledge (e.g., if X does an event, then X likely wants/needs/feels Y)
Self-CoT: A baseline method in which the LLM (ChatGPT) prompts itself to generate a rationale before answering, without the filtering or distillation proposed in this paper
OPT-1.3B: Open Pre-trained Transformer—a decoder-only language model used as the backbone for the DOCTOR model
Cosmo-3B: A dialogue-specialized language model used here as a base agent to evaluate the helpfulness of generated rationales