TOD: Task-Oriented Dialogue—systems designed to help users accomplish specific goals like booking hotels
LA: Language Agents—LLMs capable of using external tools (APIs) to perform actions
ReAct: Reasoning and Acting—a prompting technique in which models interleave reasoning traces with actions, reasoning before each action they take
CoALM-IT: The integrated multi-task dataset created in this paper, combining TOD, LA, and CRA data
CRA: Conversational ReAct API—a dataset format introduced here in which agents use multiple 'think' steps, first to decide on API calls and then to generate responses
DST: Dialogue State Tracking—estimating the user's goal/intent at each turn of a conversation
BFCL: Berkeley Function Calling Leaderboard—a benchmark for evaluating LLMs' ability to call functions correctly
MultiWOZ: Multi-Domain Wizard-of-Oz—a standard benchmark for multi-turn task-oriented dialogue
LoRA: Low-Rank Adaptation—a parameter-efficient fine-tuning technique
QLoRA: Quantized Low-Rank Adaptation—LoRA adapters trained on top of a frozen, quantized (typically 4-bit) base model to reduce memory use