SLM: Small Language Model—a language model with significantly fewer parameters (e.g., <1B) designed for efficiency and specific tasks
SFT: Supervised Fine-Tuning—training a pre-trained model on a labeled dataset of inputs and desired outputs
ToolBench: A comprehensive benchmark for evaluating tool manipulation capabilities, covering over 16,000 real-world APIs
ReAct: Reasoning and Acting—a paradigm where models alternate between generating thoughts (reasoning traces) and taking actions (tool calls)
CoT: Chain-of-Thought—a prompting technique that encourages models to generate intermediate reasoning steps before the final answer
DFS: Depth-First Search—a search strategy used in ToolLLaMA to explore solution paths
LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique that freezes the pre-trained weights and trains small low-rank update matrices instead
Gradient Checkpointing: A technique that reduces memory usage during training by storing only a subset of intermediate activations and recomputing the rest during the backward pass, at the cost of extra computation
Mixed Precision: Using lower-precision formats (e.g., FP16) for most calculations, typically alongside full-precision master weights, to speed up training and reduce memory usage
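
To make the "parameter-efficient" part of the LoRA entry concrete, here is a minimal sketch in plain Python of the arithmetic behind a low-rank update. The function names, the `scale` parameter, and the 4096 x 4096 layer size are illustrative assumptions, not taken from any particular library or model.

```python
def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for one weight matrix: full fine-tuning vs. LoRA."""
    full = d_out * d_in            # every entry of W is updated
    lora = rank * (d_out + d_in)   # B is d_out x rank, A is rank x d_in
    return full, lora


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x). W stays frozen; only A and B train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)  # 16777216 65536
```

With these assumed sizes the low-rank pair holds 256 times fewer trainable parameters than the full matrix. When `B` is all zeros, `lora_forward` returns the same output as the frozen layer, which is why LoRA commonly initializes `B` to zero.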