Towards Effective GenAI Multi-Agent Collaboration: Design and Evaluation for Enterprise Applications

Raphael Shu, Nilaksh Das, Michelle Yuan, Monica Sunkara, Yi Zhang
Amazon Web Services
arXiv (2024)
Agent, Benchmark, Reasoning

📝 Paper Summary

Multi-Agent Systems (MAS), Enterprise AI Applications
A hierarchical multi-agent framework optimizing enterprise task execution through centralized supervision, payload referencing to reduce context size, and dynamic routing to bypass unnecessary orchestration steps.
Core Problem
Designing and evaluating effective collaboration protocols is challenging for enterprise applications, where latency is critical and tasks exceed what a single agent can handle.
Why it matters:
  • Single agents struggle with complex, multi-faceted enterprise problems that require diverse specializations.
  • Existing evaluation methods often rely on expensive human review or unscalable ground-truth trajectories.
  • Latency in multi-agent systems is often high due to excessive orchestration and redundant context generation.
Concrete Example: A supervisor agent needs to pass a large code snippet from Agent C to Agent B. Standard approaches force the supervisor to regenerate the entire snippet in its output, wasting tokens and increasing latency. This framework uses pointers (payload referencing) instead.
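The example above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `PayloadStore` class, the `<payload ref='...'/>` tag syntax, and the id scheme are hypothetical, since this summary does not give the paper's actual tag format. The idea shown is that the supervisor emits a short reference tag and the runtime expands it before delivery, so the supervisor never regenerates the snippet.

```python
import re

# Hypothetical tag syntax; the paper's actual format is not given in this summary.
REF_PATTERN = re.compile(r"<payload ref='([^']+)'\s*/>")


class PayloadStore:
    """Holds large agent outputs so messages can refer to them by id."""

    def __init__(self):
        self._payloads = {}

    def put(self, content: str) -> str:
        """Store content and return a short reference id."""
        ref_id = f"p{len(self._payloads)}"
        self._payloads[ref_id] = content
        return ref_id

    def tag(self, ref_id: str) -> str:
        """Build the lightweight tag the supervisor writes instead of the content."""
        return f"<payload ref='{ref_id}'/>"

    def expand(self, message: str) -> str:
        """Replace every reference tag with the stored content before delivery."""
        return REF_PATTERN.sub(lambda m: self._payloads[m.group(1)], message)


store = PayloadStore()
snippet = "def add(a, b):\n    return a + b\n" * 50  # stands in for Agent C's code
ref_id = store.put(snippet)

# The supervisor only generates the short tag, not the snippet itself.
supervisor_message = f"Agent B, please review this code: {store.tag(ref_id)}"

# The runtime expands the tag so Agent B receives the full content.
delivered = store.expand(supervisor_message)
```

Here the supervisor's generated message stays a few dozen characters long regardless of how large the stored snippet is, which is where the token and latency savings would come from.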
Key Novelty
Optimized Hierarchical Multi-Agent Collaboration (MAC)
  • Models inter-agent communication as a tool use capability, integrating it with existing function calling mechanisms.
  • Introduces 'payload referencing' to pass large content (like code) between agents using lightweight reference tags instead of regenerating full text.
  • Implements 'dynamic routing' where a fast classifier allows simple messages to bypass the central supervisor, reducing latency.
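Dynamic routing, as described in the last bullet, can be sketched as a thin dispatch layer in front of the supervisor. This is a minimal sketch, not the paper's implementation: `make_router`, `keyword_classifier`, and the agent stubs are hypothetical names, and the keyword rules merely stand in for whatever fast classifier the authors use.

```python
from typing import Callable, Dict

# Stand-in type for an agent: takes a user message, returns a reply.
Agent = Callable[[str], str]


def make_router(classify: Callable[[str], str],
                specialists: Dict[str, Agent],
                supervisor: Agent) -> Agent:
    """Return a function that sends simple messages straight to a specialist."""

    def route(message: str) -> str:
        label = classify(message)
        target = specialists.get(label)
        if target is not None:
            return target(message)  # bypass the supervisor
        return supervisor(message)  # fall back to full orchestration

    return route


def keyword_classifier(message: str) -> str:
    """Toy classifier (assumption): a real system would use a fast model."""
    text = message.lower()
    if " and " in text:
        return "supervisor"  # treat compound requests as needing orchestration
    if "weather" in text:
        return "weather"
    if "flight" in text:
        return "flights"
    return "supervisor"


route = make_router(
    keyword_classifier,
    specialists={
        "weather": lambda m: f"[weather] {m}",
        "flights": lambda m: f"[flights] {m}",
    },
    supervisor=lambda m: f"[supervisor] {m}",
)

direct = route("What is the weather in Seattle?")
orchestrated = route("Book a flight and a hotel for Friday")
```

Falling back to the supervisor whenever the classifier returns an unknown label keeps a misclassification from dropping a request; it only costs the latency the bypass would have saved.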
Architecture
Figure 1: Hierarchical agent structure where a Supervisor Agent manages Leaf Agents (Specialists).
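One way to picture this hierarchy, given that the framework models inter-agent communication as tool use, is a single `send_message` tool that the supervisor calls to reach a leaf agent. The sketch below is an assumption throughout: the tool name, its generic JSON-schema-style definition, the `LEAF_AGENTS` registry, and the call format are hypothetical and not taken from the paper or from any specific vendor API.

```python
import json

# Hypothetical leaf agents; each stub just echoes what it was asked to do.
LEAF_AGENTS = {
    "code_agent": lambda content: f"code_agent handled: {content}",
    "docs_agent": lambda content: f"docs_agent handled: {content}",
}

# Hypothetical tool definition in a generic JSON-schema style.
SEND_MESSAGE_TOOL = {
    "name": "send_message",
    "description": "Send a message to a specialist agent and return its reply.",
    "parameters": {
        "type": "object",
        "properties": {
            "recipient": {"type": "string", "enum": sorted(LEAF_AGENTS)},
            "content": {"type": "string"},
        },
        "required": ["recipient", "content"],
    },
}


def execute_tool_call(tool_call_json: str) -> str:
    """Dispatch a supervisor tool call to the named leaf agent."""
    call = json.loads(tool_call_json)
    if call["name"] != SEND_MESSAGE_TOOL["name"]:
        raise ValueError(f"unknown tool: {call['name']}")
    args = call["arguments"]
    agent = LEAF_AGENTS.get(args["recipient"])
    if agent is None:
        raise ValueError(f"unknown recipient: {args['recipient']}")
    return agent(args["content"])


# What a supervisor model might emit when it decides to delegate.
tool_call = json.dumps({
    "name": "send_message",
    "arguments": {"recipient": "code_agent", "content": "Write a unit test"},
})
reply = execute_tool_call(tool_call)
```

Treating delegation as an ordinary tool call means the supervisor needs no separate messaging protocol; the same function-calling path that reaches external tools also reaches specialists.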
Evaluation Highlights
  • Multi-agent collaboration enhances goal success rates by up to 70% compared to single-agent approaches on the proposed benchmarks.
  • Payload referencing improves performance on code-intensive tasks by 23% while reducing communication overhead per turn by 27%.
  • Dynamic agent routing achieves ≥90% classification accuracy with ~350ms latency, enabling selective bypass of supervisor orchestration.
Breakthrough Assessment
7/10
Solid engineering optimizations for enterprise agents (payload referencing, routing) and a useful benchmark contribution. The hierarchical approach is standard, but the specific efficiency optimizations are valuable for practical deployment.