
Multi-agent Architecture Search via Agentic Supernet

Guibin Zhang, Luyang Niu, Junfeng Fang, Kun Wang, Lei Bai, Xiang Wang
National University of Singapore, Tongji University, Nanyang Technological University, Shanghai AI Laboratory, University of Science and Technology of China
arXiv (2025)
Agent Reasoning Benchmark

📝 Paper Summary

Automated Multi-Agent System Design, Agentic Architecture Search, Resource-Efficient Agents
MaAS replaces static multi-agent workflows with an agentic supernet that dynamically samples query-specific architectures, balancing performance with token costs via differentiable and textual gradient optimization.
Core Problem
Existing automated agent design methods search for a single, complex, 'one-size-fits-all' workflow, which is inefficient for simple queries and fails to adapt to diverse domains within a single benchmark.
Why it matters:
  • Deploying complex multi-agent systems for simple tasks (e.g., elementary arithmetic) wastes significant computational resources and money (token costs)
  • Static architectures struggle with heterogeneous benchmarks (e.g., GAIA) where some tasks need web search while others need file reading, forcing practitioners to split datasets manually
  • Current SOTA methods like AFlow optimize for performance but ignore the prohibitive inference costs of massive agent teams
Concrete Example: For a simple arithmetic question like '2+2', current systems might trigger a complex multi-agent debate that consumes thousands of tokens. Conversely, for a Ph.D.-level algebra problem, a simple chain-of-thought prompt is likely to fail. A static system cannot handle both cases optimally.
Key Novelty
Agentic Supernet (MaAS)
  • Paradigm shift from searching for one optimal graph to optimizing a probability distribution over many possible agent architectures (the supernet)
  • Introduces a controller that inspects the query difficulty and samples a custom multi-agent topology (e.g., simple I/O for easy tasks, multi-turn debate for hard ones) per instance
  • Optimizes discrete agent components (prompts, tools) using textual gradients while optimizing architecture probabilities using differentiable sampling
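The per-query sampling idea in the bullets above can be illustrated with a toy sketch. It assumes a scalar difficulty score in [0, 1], a hypothetical operator pool with made-up costs, and a hand-written scoring rule standing in for the learned controller. None of these names or numbers come from the paper, and no training step (differentiable or textual-gradient) is shown.

```python
import math
import random

# Hypothetical operator pool and relative costs (illustrative only).
OPERATORS = ["io", "chain_of_thought", "self_refine", "debate"]
EARLY_EXIT = "early_exit"
COST = {"io": 1.0, "chain_of_thought": 2.0, "self_refine": 3.0, "debate": 5.0}


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score(op, difficulty, layer):
    """Toy stand-in for a learned controller (an assumption, not MaAS itself).

    Easy queries (difficulty near 0) favor cheap operators and early exit;
    hard queries (difficulty near 1) favor expensive operators.
    """
    if op == EARLY_EXIT:
        return 3.0 * (1.0 - difficulty) + layer
    return (2.0 * difficulty - 1.0) * COST[op]


def sample_architecture(difficulty, rng, max_layers=4):
    """Sample one operator per layer until early exit or max_layers."""
    architecture = []
    for layer in range(max_layers):
        # The first layer must do some work, so early exit is only offered later.
        candidates = OPERATORS if layer == 0 else OPERATORS + [EARLY_EXIT]
        probs = softmax([score(op, difficulty, layer) for op in candidates])
        choice = rng.choices(candidates, weights=probs, k=1)[0]
        if choice == EARLY_EXIT:
            break
        architecture.append(choice)
    return architecture


if __name__ == "__main__":
    rng = random.Random(0)
    print("easy query:", sample_architecture(0.05, rng))
    print("hard query:", sample_architecture(0.95, rng))
```

Under this toy scoring rule, easy queries tend to yield short, cheap pipelines and hard queries tend to yield deeper ones. That is the qualitative behavior the summary attributes to the controller, though the actual parameterization may differ.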
Architecture
Figure 2: The overall framework of MaAS, showing how a query is processed by a controller to sample a subnetwork from the agentic supernet.
Evaluation Highlights
  • Achieves 51.82% accuracy on the MATH benchmark at an inference cost of only $0.42, compared to AFlow's 51.28% accuracy at $1.66 (approx. 4x cheaper)
  • Outperforms state-of-the-art automated methods by 0.54% to 16.89% across six benchmarks including HumanEval and GSM8K
  • Reduces training costs significantly: optimization on MATH takes 53 minutes and $3.38, whereas the comparable baseline AFlow requires 184 minutes and $22.50
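The ratios behind the MATH figures above can be checked with a few lines of arithmetic. The dictionary keys are illustrative names; the numbers are the ones quoted in the highlights.

```python
# Figures quoted in the highlights for the MATH benchmark.
maas = {"accuracy": 51.82, "inference_usd": 0.42, "train_usd": 3.38, "train_min": 53}
aflow = {"accuracy": 51.28, "inference_usd": 1.66, "train_usd": 22.50, "train_min": 184}

# How many times more AFlow costs than MaAS on each axis.
inference_ratio = aflow["inference_usd"] / maas["inference_usd"]
train_cost_ratio = aflow["train_usd"] / maas["train_usd"]
train_time_ratio = aflow["train_min"] / maas["train_min"]

# Accuracy difference in percentage points.
accuracy_gap = maas["accuracy"] - aflow["accuracy"]

print(f"inference cost ratio: {inference_ratio:.2f}x")
print(f"training cost ratio:  {train_cost_ratio:.2f}x")
print(f"training time ratio:  {train_time_ratio:.2f}x")
print(f"accuracy gap:         {accuracy_gap:.2f} points")
```

This gives roughly 3.95x cheaper inference (the "approx. 4x" above), about 6.66x lower training cost, about 3.47x shorter training time, and a 0.54-point accuracy gain on MATH.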
Breakthrough Assessment
8/10
Significantly advances automated agent design by addressing the efficiency vs. performance trade-off. The 'agentic supernet' with dynamic, query-dependent routing is a strong conceptual leap over static graph search.