← Back to Paper List

Scaling Small Agents Through Strategy Auctions

Lisa Alazraki, William F. Shen, Yoram Bachrach, Akhil Mathur
Meta Superintelligence Labs, Imperial College London, University of Cambridge
arXiv (2026)
Agent Memory Benchmark

📝 Paper Summary

Multi-agent coordination · Agentic workflow optimization · Model routing
SALE is a marketplace-inspired framework where heterogeneous agents bid with strategic plans to win tasks based on cost-value scoring, refining their bids over time using auction memory to improve small-agent performance.
Core Problem
Small agents handle simple tasks well but degrade on complex, long-horizon ones, while routing every task to a large agent is cost-inefficient. Existing predictive routers struggle with agentic workflows.
Why it matters:
  • Applying large models to every task is prohibitively expensive for long-horizon agentic workflows involving thousands of tokens.
  • Simple routing based on task descriptions fails because short prompts don't capture the complexity of the required reasoning trajectory.
  • Static routers do not allow smaller agents to improve or adapt to the workload distribution over time.
Concrete Example: On a complex coding task taking humans ~1 hour, the smallest agent (4B) achieves only ~17% of the largest agent's success rate. However, a router that just picks the largest model wastes money on the ~92% of simple tasks that the 4B model could solve perfectly well.
Key Novelty
Strategy Auctions for Workload Efficiency (SALE)
  • Agents 'bid' for tasks by generating short strategic plans rather than full solutions; these plans are scored for cost (length) and value (entropy + peer/self-review).
  • The auction lets smaller agents 'upskill': they retrieve past successful strategies from a shared memory and refine their bids before the final winner is chosen.
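The bidding and memory steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Bid` fields, the per-agent `unit_price`, the linear `value - cost_weight * cost` score, and the word-overlap retrieval are all hypothetical stand-ins. In particular, `value` is taken as a given number in place of the entropy and peer/self-review signals the summary mentions.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    agent: str    # agent name (hypothetical)
    plan: str     # short strategic plan submitted as the bid
    value: float  # assumed value estimate; stands in for entropy + peer/self-review


def plan_cost(bid, unit_price):
    """Cost proxy: plan length in words times an assumed per-agent unit price."""
    return len(bid.plan.split()) * unit_price[bid.agent]


def bid_score(bid, unit_price, cost_weight):
    """Assumed cost-value score: higher value and lower cost are both better."""
    return bid.value - cost_weight * plan_cost(bid, unit_price)


def run_auction(bids, unit_price, cost_weight=0.01):
    """Return the bid with the highest cost-value score."""
    return max(bids, key=lambda b: bid_score(b, unit_price, cost_weight))


def retrieve_strategies(task, memory, k=1):
    """Return the k stored plans whose task text overlaps most with `task`.

    `memory` is a list of (past_task, successful_plan) pairs; similarity is
    Jaccard overlap on lowercase words, chosen here only for simplicity.
    """
    words = set(task.lower().split())

    def overlap(entry):
        past = set(entry[0].lower().split())
        union = words | past
        return len(words & past) / len(union) if union else 0.0

    ranked = sorted(memory, key=overlap, reverse=True)
    return [plan for _, plan in ranked[:k]]


# Illustrative values only; none of these numbers come from the paper.
unit_price = {"small": 1.0, "large": 8.0}
bids = [
    Bid("small", "search docs then patch the parser", 0.6),
    Bid("large", "search docs then patch the parser and add tests", 0.7),
]
winner = run_auction(bids, unit_price, cost_weight=0.01)

memory = [
    ("fix parser bug in json loader", "reproduce, then patch parser"),
    ("find population of city", "search then cross-check sources"),
]
hints = retrieve_strategies("fix the parser bug", memory, k=1)
```

In this sketch a small agent would fold the retrieved `hints` into a revised plan and resubmit its bid before `run_auction` is called. Setting `cost_weight` to zero reduces the auction to picking the highest-value bid regardless of cost.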
Evaluation Highlights
  • Reduces reliance on the largest agent by 53% and overall cost by 35% across deep search and coding tasks compared to using the largest agent alone.
  • Consistently improves upon the largest agent's pass@1 accuracy (+3.5% on deep search, +2.7% on coding) despite lower costs.
  • Outperforms established predictive routers (Willingness-to-Pay, CARROT), which either fail to reduce costs significantly or degrade performance on complex tasks.
Breakthrough Assessment
8/10
Strong conceptual novelty in applying auction theory to agent routing. Demonstrates that small agents can be 'scaled up' at test time via strategy refinement, shifting the Pareto frontier beyond any single model.
×