SFT: Supervised Fine-Tuning—training a model on labeled examples (input-output pairs) to learn a specific task
DPO: Direct Preference Optimization—a method that aligns a model with human or system preferences by training directly on pairs of preferred and rejected outputs, without fitting a separate reward model
Function Calling: The capability of an LLM to generate structured outputs (like JSON) that invoke specific software functions with correct arguments
LoRA: Low-Rank Adaptation—a parameter-efficient fine-tuning technique that freezes pre-trained weights and trains small rank decomposition matrices
AST: Abstract Syntax Tree—a tree representation of the abstract syntactic structure of source code, used here to evaluate the correctness of generated function calls
APIGen: A data synthesis framework used by the authors to generate verifiable function-calling datasets by executing the calls to ensure validity
Mixture-of-Experts (MoE): A model architecture that uses multiple sub-networks ('experts') and a gating mechanism to activate only a subset of them for each input token
FSDP: Fully Sharded Data Parallel—a memory optimization technique for distributed training that shards model parameters, gradients, and optimizer states across GPUs
Hallucination: When a model generates incorrect or non-existent information, such as inventing function arguments that weren't in the user query
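
To make the Function Calling, AST, and Hallucination entries concrete, here is a minimal sketch of an AST-based check on a generated function call. It is an illustration under stated assumptions, not the authors' evaluator: the helper names `parse_call` and `check_call`, the schema layout, and the `get_weather` function are all hypothetical, and the sketch assumes calls are written in Python syntax with keyword arguments only.

```python
import ast


def parse_call(call_str):
    """Parse a string like 'f(a=1)' into (function_name, keyword_args)."""
    node = ast.parse(call_str, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("not a simple function call")
    if node.args:
        raise ValueError("positional arguments are not handled in this sketch")
    kwargs = {}
    for kw in node.keywords:
        if kw.arg is None:
            raise ValueError("**kwargs unpacking is not handled in this sketch")
        # literal_eval accepts an AST node and rejects anything but literals.
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return node.func.id, kwargs


def check_call(call_str, schema, expected_args):
    """Return a list of problems; an empty list means the call passes."""
    try:
        name, kwargs = parse_call(call_str)
    except (SyntaxError, ValueError) as exc:
        return [f"unparseable call: {exc}"]
    problems = []
    if name != schema["name"]:
        problems.append(f"wrong function: {name}")
    for arg in kwargs:
        # An argument the schema does not define counts as hallucinated.
        if arg not in schema["parameters"]:
            problems.append(f"hallucinated argument: {arg}")
    for arg in schema["required"]:
        if arg not in kwargs:
            problems.append(f"missing required argument: {arg}")
    for arg, value in expected_args.items():
        if arg in kwargs and kwargs[arg] != value:
            problems.append(f"wrong value for {arg}: {kwargs[arg]!r}")
    return problems


# Hypothetical schema and model outputs, for illustration only.
schema = {"name": "get_weather", "parameters": {"city", "unit"}, "required": ["city"]}
print(check_call('get_weather(city="Paris")', schema, {"city": "Paris"}))
# prints []
print(check_call('get_weather(city="Paris", country="FR")', schema, {"city": "Paris"}))
# prints ['hallucinated argument: country']
```

Comparing parsed structure rather than raw strings means that differences in whitespace or quoting style do not count as errors, while an invented argument such as `country` is still flagged.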