SFT: Supervised Fine-Tuning, i.e., training a model on a static dataset of inputs paired with target outputs
Online RL: Reinforcement Learning in which the agent interacts with a live environment and learns from trial-and-error feedback rather than from static offline data
Trajectory Skeleton: A generated Directed Acyclic Graph (DAG) of tool calls representing the logical steps to solve a task, created before the instruction text
Dependently-typed tools: Tools whose output type is mathematically determined by their input values (e.g., an 'add' function taking two prices returns a price)
Type Generator: A function that creates random valid instances of a specific data type (e.g., generating a random 'hotel-rating' float between 1.0 and 5.0)
Type Recognizer: A boolean function used to validate whether an agent's input matches the required type for a tool
NESTFUL: A benchmark dataset for evaluating tool-use agents, noted for requiring complex reasoning
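
The Type Generator and Type Recognizer entries above can be illustrated with a minimal sketch. It uses the 'hotel-rating' example from the glossary; the function names, the one-decimal rounding, and the choice to accept only Python floats are assumptions made for illustration, not details taken from the source.

```python
import random

# Bounds from the glossary's 'hotel-rating' example.
HOTEL_RATING_MIN, HOTEL_RATING_MAX = 1.0, 5.0


def generate_hotel_rating(rng: random.Random) -> float:
    """Type generator (hypothetical name): return a random valid hotel rating."""
    # Rounding to one decimal is an illustrative choice; the result stays in range.
    return round(rng.uniform(HOTEL_RATING_MIN, HOTEL_RATING_MAX), 1)


def is_hotel_rating(value: object) -> bool:
    """Type recognizer (hypothetical name): True only for a float within the bounds."""
    if not isinstance(value, float):
        return False
    # NaN fails both comparisons, so it is rejected here as well.
    return HOTEL_RATING_MIN <= value <= HOTEL_RATING_MAX


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is repeatable
    sample = generate_hotel_rating(rng)
    print(sample, is_hotel_rating(sample))
    print(is_hotel_rating(7.5), is_hotel_rating("4.0"))
```

Pairing the two functions this way means any value the generator produces should be accepted by the recognizer, which is the property a tool would rely on when validating an agent's arguments.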