
AFlow: Automating Agentic Workflow Generation

Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xiong-hui Chen, Jiaqi Chen, Mingchen Zhuge, Xin Cheng, Sirui Hong, Jinlin Wang, Bingnan Zheng, Bangbang Liu, Yuyu Luo, Chenglin Wu
DeepWisdom, The Hong Kong University of Science and Technology (Guangzhou), Renmin University of China, Nanjing University, Fudan University, King Abdullah University of Science and Technology, Université de Montréal & Mila, The Hong Kong University of Science and Technology
International Conference on Learning Representations (2025)
Agent Reasoning Benchmark

📝 Paper Summary

Automated Agentic Optimization Workflow Generation
AFLOW automates the creation of agentic workflows by representing them as code and framing their design as a search problem, using Monte Carlo Tree Search to iteratively refine workflow structure and prompts for higher performance at lower cost.
Core Problem
Manually designing agentic workflows (sequences of LLM invocations) requires significant human effort and limits scalability, while existing automated methods are constrained by narrow search spaces or explore them inefficiently.
Why it matters:
  • Human-designed workflows are hard to scale to new domains and lack transferability
  • Existing automated methods (like ADAS) use linear heuristic search that fails to discover effective workflows efficiently
  • Optimizing workflows allows smaller, cheaper models to outperform larger, expensive models, democratizing advanced AI capabilities
Concrete Example: In GSM8K (math), a standard Chain-of-Thought workflow might fail on complex reasoning. Manually adding 'Review' or 'Ensemble' nodes is tedious. AFLOW automatically discovers a workflow that generates 5 solutions, ensembles them, and verifies the result using a Python programmer node, improving success rates without human design.
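The shape of the workflow in this example can be sketched in a few lines of Python. This is a minimal illustration, not the workflow AFLOW actually emits: `llm` and `programmer` are hypothetical callables standing in for an LLM node and a code-executing verifier node, and the majority vote and the agreement check are assumed implementations of "ensemble" and "verify".

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical stand-ins (not from the paper):
#   LLM: takes a question and a sample index, returns an answer string.
#   Programmer: takes a question, returns an answer computed by running code.
LLM = Callable[[str, int], str]
Programmer = Callable[[str], str]


def generate(llm: LLM, question: str, n: int = 5) -> List[str]:
    """Sample n candidate answers for the same question."""
    return [llm(question, i) for i in range(n)]


def ensemble(candidates: List[str]) -> str:
    """Pick the most frequent answer (simple majority vote, an assumption)."""
    return Counter(candidates).most_common(1)[0][0]


def workflow(question: str, llm: LLM, programmer: Programmer,
             n: int = 5) -> Tuple[str, bool]:
    """Generate n solutions, ensemble them, then check the winner
    against an independently computed answer."""
    answer = ensemble(generate(llm, question, n))
    verified = programmer(question) == answer
    return answer, verified


# Usage with canned, deterministic stand-ins for the two nodes.
fake_answers = ["12", "12", "13", "12", "11"]
result = workflow(
    "What is 3 * 4?",
    llm=lambda q, i: fake_answers[i],
    programmer=lambda q: str(3 * 4),
)
```

Here `result` pairs the voted answer with a flag saying whether the verifier agreed. What a real workflow does on disagreement (retry, fall back to the verifier's answer, and so on) is left open.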
Key Novelty
MCTS-driven Search over Code-Represented Workflows
  • Reformulates workflow optimization as a search over code, where nodes are LLM calls and edges are logic (loops, conditionals)
  • Uses Monte Carlo Tree Search (MCTS) to navigate this infinite space, treating code modifications as search steps and storing successful patterns in the tree
  • Introduces 'Operators' (predefined code blocks like Ensemble or Review) to accelerate search, while allowing the LLM optimizer to write custom logic
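The search loop described above can be illustrated with a toy tree search. This is a simplified sketch under several assumptions, not AFLOW's implementation: a workflow is reduced to a tuple of operator names rather than real code, a "modification" only appends one unused operator, selection uses the textbook UCB1 rule, and `evaluate` is a hypothetical scoring function that a real system would replace with benchmark execution.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Workflow = Tuple[str, ...]
# Operator names taken from the summary; their behaviour is not modeled here.
OPERATORS = ("Ensemble", "Review", "Programmer")


@dataclass(eq=False)
class Node:
    workflow: Workflow
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total: float = 0.0

    def untried(self) -> List[str]:
        """Operators not yet in this workflow and not yet tried as a child."""
        used = {child.workflow[-1] for child in self.children}
        return [op for op in OPERATORS
                if op not in self.workflow and op not in used]


def ucb(child: Node, c: float = 1.4) -> float:
    """UCB1: mean score plus an exploration bonus (assumed selection rule)."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(child.parent.visits) / child.visits)


def search(evaluate: Callable[[Workflow], float], rounds: int = 20) -> Workflow:
    root = Node(("Generate",))
    best, best_score = root.workflow, evaluate(root.workflow)
    root.visits, root.total = 1, best_score
    for _ in range(rounds):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.untried() and node.children:
            node = max(node.children, key=ucb)
        # Expansion: apply one untried modification, if any remain.
        options = node.untried()
        if options:
            child = Node(node.workflow + (options[0],), parent=node)
            node.children.append(child)
            node = child
        # Evaluation: score the candidate workflow.
        score = evaluate(node.workflow)
        if score > best_score:
            best, best_score = node.workflow, score
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += score
            node = node.parent
    return best


def toy_score(w: Workflow) -> float:
    """Made-up scores for illustration only; not results from the paper."""
    return (0.5 + 0.2 * ("Ensemble" in w) + 0.1 * ("Programmer" in w)
            - 0.05 * ("Review" in w))


best_workflow = search(toy_score)
```

With these made-up scores, the search settles on a workflow that combines `Generate`, `Ensemble`, and `Programmer` and leaves out `Review`. The tree keeps every evaluated variant, so later rounds can branch from any earlier workflow rather than only the most recent one.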
Architecture
Figure 3
The overall AFLOW framework and iterative search cycle
Evaluation Highlights
  • Outperforms state-of-the-art baselines by 5.7% on average across 6 benchmarks (including Math, Code, and QA)
  • Surpasses ADAS (previous SOTA automated method) by 19.5% on average, with a 57% improvement on MATH lv5 and MBPP
  • Enables GPT-4o-mini (via AFLOW) to outperform GPT-4o on HumanEval (94.7% vs 93.9%) at only 4.55% of the inference cost
Breakthrough Assessment
9/10
Significant leap in automating agent design. The ability for smaller models to beat larger ones via discovered workflows is a major efficiency breakthrough. The move to code-based MCTS search solves the expressivity limits of graph-based methods.