
Single-agent or Multi-agent Systems? Why Not Both?

Mingyan Gao, Yanzi Li, Banruo Liu, Yifan Yu, Phillip Wang, Ching-Yu Lin, Fan Lai
Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign
arXiv (2025)
Agent, Benchmark, Reasoning

📝 Paper Summary

Multi-agent vs. single-agent comparison, agentic system optimization, hybrid agent architectures
As frontier LLMs improve, the accuracy gap between single-agent and multi-agent systems narrows while the cost gap widens, motivating a hybrid approach that dynamically routes each request to one or the other.
Core Problem
Multi-agent systems (MAS) incur significantly higher complexity and cost than single-agent systems (SAS), and their accuracy advantage is diminishing as frontier LLMs improve in long-context reasoning.
Why it matters:
  • Deploying MAS involves high engineering effort and runtime costs (latency, tokens), which may not be justifiable if accuracy gains are minimal
  • MAS can degrade performance due to coordination breakdowns and 'overthinking' on simple tasks
  • Practitioners lack guidance on navigating the accuracy-efficiency tradeoff when choosing between SAS and MAS
Concrete Example: In code generation tasks using the Self-Collab framework, the 'problem analyst' and 'tester' agents may introduce unnecessary corner cases that overwhelm the 'coder' agent, causing it to fail on a task that a single agent could solve correctly.
Key Novelty
Hybrid Agent Routing and Cascading
  • Formalizes agent execution as a dependency graph to identify 'critical agents' that bottleneck performance
  • Proposes a 'confidence-guided tracing' method to attribute errors to specific agents based on confidence and output quality
  • Introduces 'Agent Routing' and 'Agent Cascade' paradigms to selectively offload requests between SAS and MAS, optimizing the accuracy-efficiency frontier
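The two hybrid paradigms named above can be illustrated with a minimal sketch. This summary does not specify how the paper decides where to send a request, so everything below is an assumption made for illustration: the `route` and `cascade` functions, the `is_hard` difficulty predictor, the `threshold` value, and the toy `toy_sas` / `toy_mas` stand-ins (which fake a confidence score from request length). The sketch only shows the general shape of the idea: routing picks one system up front, while a cascade tries the cheaper SAS first and escalates to MAS when confidence is low.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interface: a system maps a request to (answer, confidence in [0, 1]).
AgentSystem = Callable[[str], Tuple[str, float]]


@dataclass
class HybridResult:
    answer: str
    used: str  # "SAS" or "MAS"


def route(request: str, sas: AgentSystem, mas: AgentSystem,
          is_hard: Callable[[str], bool]) -> HybridResult:
    """Routing sketch: choose exactly one system before running anything."""
    if is_hard(request):
        answer, _ = mas(request)
        return HybridResult(answer, "MAS")
    answer, _ = sas(request)
    return HybridResult(answer, "SAS")


def cascade(request: str, sas: AgentSystem, mas: AgentSystem,
            threshold: float = 0.7) -> HybridResult:
    """Cascade sketch: run the cheap SAS first, escalate only on low confidence."""
    answer, confidence = sas(request)
    if confidence >= threshold:
        return HybridResult(answer, "SAS")
    answer, _ = mas(request)
    return HybridResult(answer, "MAS")


# Toy stand-ins (assumptions): short requests count as "easy" for the single agent.
def toy_sas(request: str) -> Tuple[str, float]:
    confidence = 0.9 if len(request) < 20 else 0.3
    return f"sas:{request}", confidence


def toy_mas(request: str) -> Tuple[str, float]:
    return f"mas:{request}", 0.95


if __name__ == "__main__":
    print(cascade("2 + 2", toy_sas, toy_mas))
    print(cascade("design a distributed scheduler", toy_sas, toy_mas))
    print(route("2 + 2", toy_sas, toy_mas, is_hard=lambda r: len(r) >= 20))
```

In this sketch the cascade pays the MAS cost only for requests where the single agent reports low confidence, which is the intuition behind the cost savings described in the evaluation; a real deployment would need a calibrated confidence signal rather than the length heuristic used here.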
Architecture
Figure 1: Comparison of single-agent vs. multi-agent paradigms and the proposed hybrid approach.
Evaluation Highlights
  • Hybrid design improves accuracy by 1.1% to 12% across various agentic applications compared to pure MAS or SAS baselines
  • Reduces deployment costs by up to 88.1% compared to running MAS alone
  • SAS with Gemini-2.0-Flash matches or beats MAS on simple tasks, while MAS consumes 4–220× more input tokens than SAS
Breakthrough Assessment
8/10
Provides a critical, empirical reassessment of the prevailing 'multi-agent is better' narrative. The proposed hybrid routing mechanism offers a practical way to manage the cost-accuracy tradeoff.