
ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction

Xingshan Zeng, Weiwen Liu, Lingzhi Wang, Liangyou Li, Fei Mi, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu
Huawei Technologies Co., Ltd, Shanghai Jiao Tong University, Harbin Institute of Technology, Shenzhen
arXiv (2025)
Agent Benchmark RL

📝 Paper Summary

Synthetic data generation for agents · Multi-turn agentic interaction
ToolACE-MT generates high-quality multi-turn agentic dialogue data efficiently by first creating a coarse trajectory skeleton and then iteratively refining it with complexity injections, avoiding costly autoregressive multi-agent simulations.
Core Problem
Existing methods for generating multi-turn agentic data rely on autoregressive multi-agent simulations (MAS), which are computationally expensive, offer little control over dialogue complexity, and are prone to error accumulation because they lack global context.
Why it matters:
  • High-quality multi-turn data is essential for training agents to handle complex real-world tasks involving partial observability and dependent tool calls
  • Autoregressive generation is slow and costly because every turn requires a new inference step based on growing context
  • Assistants in standard simulations lack holistic awareness of the full task plan, leading to inconsistencies and factual errors over long horizons
Concrete Example: In a standard multi-agent simulation, an assistant might call a tool to book a flight without realizing the return date in a later subtask makes the itinerary impossible, because it generates one step at a time. ToolACE-MT plans the full skeleton first, ensuring the dates align before filling in the dialogue.
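The benefit of planning the skeleton first can be illustrated with a small check that runs over the whole plan before any dialogue text is written. This is a minimal sketch, not the paper's implementation: the `PlannedCall` structure, the tool names, and the `find_date_conflicts` helper are all hypothetical and exist only to show that a global check becomes possible once the full trajectory is available.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PlannedCall:
    """One tool call in a planned skeleton (hypothetical structure)."""
    tool: str
    travel_date: date


def find_date_conflicts(skeleton):
    """Return (earlier_tool, later_tool) pairs whose dates run backwards."""
    conflicts = []
    for earlier, later in zip(skeleton, skeleton[1:]):
        if later.travel_date < earlier.travel_date:
            conflicts.append((earlier.tool, later.tool))
    return conflicts


# Hypothetical skeleton: the return flight is dated before the outbound one.
skeleton = [
    PlannedCall("book_outbound_flight", date(2025, 6, 10)),
    PlannedCall("book_return_flight", date(2025, 6, 5)),
]
conflicts = find_date_conflicts(skeleton)
print(conflicts)
```

A turn-by-turn generator would only see the second call after the first had already been committed to the dialogue, whereas here the conflict is visible before any turn is written.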
Key Novelty
Non-Autoregressive Iterative Generation for Agentic Data
  • Decouples structure from content: Generates a complete dialogue skeleton (user tasks + tool actions) first, then fills in natural language details, unlike standard methods that generate them simultaneously turn-by-turn
  • Iterative Refinement via Mask-and-Fill: Systematically injects complexity (e.g., user errors, clarifications) into the skeleton by masking specific turns and regenerating them, similar to non-autoregressive translation methods
  • Global planning consistency: By generating the action trajectory upfront based on a plan, the assistant's behavior remains consistent across long horizons
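The mask-and-fill idea described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' pipeline: the `Turn` type, the `MASK` token, and `toy_fill` (a stand-in for what would presumably be an LLM call) are hypothetical names. The point it shows is that the fill step sees the entire dialogue, including turns after the masked one.

```python
from dataclasses import dataclass

MASK = "<MASK>"  # hypothetical placeholder token


@dataclass
class Turn:
    role: str     # "user", "assistant", or "tool"
    content: str


def mask_turns(dialogue, indices):
    """Return a copy of the dialogue with the chosen turns masked."""
    return [
        Turn(t.role, MASK) if i in indices else Turn(t.role, t.content)
        for i, t in enumerate(dialogue)
    ]


def fill_masks(dialogue, fill_fn):
    """Regenerate every masked turn; fill_fn receives the full dialogue."""
    return [
        Turn(t.role, fill_fn(dialogue, i)) if t.content == MASK else t
        for i, t in enumerate(dialogue)
    ]


def toy_fill(dialogue, i):
    """Stand-in for a model call (assumption): conditions on the next turn."""
    following = dialogue[i + 1].content if i + 1 < len(dialogue) else ""
    return f"[rewritten {dialogue[i].role} turn consistent with: {following}]"


# Hypothetical skeleton: user task, tool action, tool result.
skeleton = [
    Turn("user", "Book a flight from Shanghai to Shenzhen on June 10."),
    Turn("assistant", "book_flight(origin='Shanghai', dest='Shenzhen')"),
    Turn("tool", "{'status': 'confirmed'}"),
]

masked = mask_turns(skeleton, {0})      # mask only the user turn
refined = fill_masks(masked, toy_fill)  # regenerate it using later context
for turn in refined:
    print(turn.role, "|", turn.content)
```

Because only the masked turns are regenerated, the unmasked tool action stays fixed, which is how the planned trajectory remains consistent while surface complexity is added.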
Architecture
Figure 2: The overall workflow of ToolACE-MT, illustrating the three stages: Initialization, Iterative Refinement, and Offline Verification.
Evaluation Highlights
  • Models trained on ToolACE-MT data outperform those trained on autoregressive MAS data on benchmarks like BFCL-v3, τ-bench, and ACEBench
  • Efficient scaling: The iterative refinement process allows flexible complexity scaling without the linear cost increase of full autoregressive regeneration
  • Data analysis confirms the generation pipeline produces diverse and valid agentic trajectories suitable for training tool-use LLMs
Breakthrough Assessment
8/10
Offers a significant paradigm shift from expensive autoregressive simulation to efficient non-autoregressive generation for agentic data, addressing key bottlenecks in cost and controllability.