
HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research

Yinghao Zhu, Yifan Qi, Zixiang Wang, Lei Gu, Dehao Sui, Haoran Hu, Xichen Zhang, Ziyi He, Liantao Ma, Lequan Yu
Peking University, The University of Hong Kong, The Hong Kong University of Science and Technology, Shanghai Artificial Intelligence Laboratory
arXiv.org (2025)
Agent Memory Benchmark Reasoning

📝 Paper Summary

Self-evolving, Agentic reasoning, Agentic healthcare data analysis
HealthFlow is an autonomous healthcare agent that improves its high-level research strategies over time by distilling successful task executions into a structured experience memory.
Core Problem
Current AI agents rely on static, hard-coded strategic frameworks for task decomposition, preventing them from learning how to orchestrate complex healthcare workflows or adapt plans based on previous failures.
Why it matters:
  • Healthcare research involves open-ended problems and noisy data where rigid, predefined strategies often fail to adapt to intermediate findings
  • Existing agents optimize tool usage (component-level) but cannot refine their overarching management policy, limiting autonomy in high-stakes domains
  • The lack of meta-level learning means agents repeat strategic errors rather than accumulating procedural wisdom like human researchers
Concrete Example: In a data visualization task involving blood pressure, a standard agent creates a plot immediately, ignoring outliers that distort the scale. HealthFlow, recalling a 'warning' experience from a prior task, proactively inserts a data filtering step to remove unrealistic values before plotting, ensuring interpretability.
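The filtering step in this example can be sketched as follows. This is a minimal illustration, not the code HealthFlow generates: the function name `filter_plausible` and the bounds `SBP_MIN` and `SBP_MAX` are hypothetical, and the numeric range is an assumed placeholder rather than a value from the paper.

```python
from typing import Iterable, List

# Hypothetical plausibility bounds for systolic blood pressure (mmHg).
# Illustrative assumptions only; they are not taken from the paper.
SBP_MIN, SBP_MAX = 40.0, 300.0


def filter_plausible(values: Iterable[float],
                     low: float = SBP_MIN,
                     high: float = SBP_MAX) -> List[float]:
    """Keep only readings inside the closed interval [low, high]."""
    return [v for v in values if low <= v <= high]


# Two entries (9999.0 and -5.0) are data-entry style outliers that would
# otherwise stretch the plot's axis and hide the real variation.
readings = [118.0, 125.0, 9999.0, 132.0, -5.0, 141.0]
clean = filter_plausible(readings)
# `clean` is the series that would be handed to the plotting step.
```

Filtering before plotting keeps the axis range driven by realistic readings, which is the interpretability benefit the example describes.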
Key Novelty
Meta-Level Evolution via Structured Experience Memory
  • Treats every completed task as a learning opportunity by reflecting on execution traces to synthesize durable 'experiences' (heuristics, code snippets, warnings)
  • Updates the agent's high-level planning policy by retrieving these structured experiences for new tasks, allowing it to start with better strategies rather than just better tools
  • Decouples strategic evolution (learning *how* to plan) from static execution, moving beyond simple tool-library expansion found in prior work
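One way to picture the structured experience memory is sketched below. The three experience kinds come from the summary above; everything else is an assumption made for illustration, including the class names `Experience` and `ExperienceMemory`, the `keywords` field, and the keyword-overlap retrieval rule (the summary does not say how HealthFlow retrieves experiences).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Experience:
    kind: str                      # "heuristic", "code_snippet", or "warning"
    content: str                   # the distilled lesson itself
    keywords: Set[str] = field(default_factory=set)  # hypothetical index terms


class ExperienceMemory:
    """Toy store; retrieval by keyword overlap is an assumed stand-in."""

    def __init__(self) -> None:
        self._items: List[Experience] = []

    def add(self, exp: Experience) -> None:
        self._items.append(exp)

    def retrieve(self, task: str, k: int = 3) -> List[Experience]:
        words = set(task.lower().split())
        # Score each experience by how many of its keywords appear in the task.
        scored = [(len(e.keywords & words), e) for e in self._items]
        scored = [(s, e) for s, e in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [e for _, e in scored[:k]]


memory = ExperienceMemory()
memory.add(Experience("warning",
                      "Filter unrealistic values before plotting.",
                      {"plot", "blood", "pressure"}))
memory.add(Experience("heuristic",
                      "Check cohort size before modeling.",
                      {"cohort", "model"}))

# Experiences retrieved here would be passed to the planner for a new task.
hits = memory.retrieve("plot blood pressure over time")
```

The point of the sketch is the separation of concerns: the memory holds planning-level lessons, so a new task can start from a better strategy even when the available tools are unchanged.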
Architecture
Figure 1: The self-evolving architecture of HealthFlow, detailing the interaction between the four agents (Meta, Executor, Evaluator, Reflector) and the Experience Memory.
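The interaction between the four agents and the memory might be wired together roughly as below. The agent roles are named in the figure caption, but the call order, the function `run_task`, the callable signatures, and the stub agents are all assumptions for illustration; the actual system may include feedback paths (for example, retries after a poor evaluation) that this sketch omits.

```python
from typing import Callable, List, Tuple

Plan = str
Trace = str


def run_task(task: str,
             memory: List[str],
             meta: Callable[[str, List[str]], Plan],
             executor: Callable[[Plan], Trace],
             evaluator: Callable[[str, Trace], float],
             reflector: Callable[[str, Trace, float], List[str]]
             ) -> Tuple[Trace, float]:
    """One pass of an assumed plan -> execute -> evaluate -> reflect loop."""
    plan = meta(task, memory)            # plan with access to past experiences
    trace = executor(plan)               # carry out the plan
    score = evaluator(task, trace)       # judge the outcome
    memory.extend(reflector(task, trace, score))  # store distilled lessons
    return trace, score


# Stub agents standing in for LLM-backed components (hypothetical).
def meta(task: str, memory: List[str]) -> Plan:
    return f"{task} | hints: {len(memory)}"


def executor(plan: Plan) -> Trace:
    return f"executed({plan})"


def evaluator(task: str, trace: Trace) -> float:
    return 1.0 if task in trace else 0.0


def reflector(task: str, trace: Trace, score: float) -> List[str]:
    return [f"lesson from '{task}'"] if score > 0 else []


memory: List[str] = []
first = run_task("plot bp", memory, meta, executor, evaluator, reflector)
second = run_task("plot hr", memory, meta, executor, evaluator, reflector)
```

The second call sees one stored lesson when planning, which is the sense in which the planning policy, rather than the executor, changes between tasks.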
Evaluation Highlights
  • +15.99 percentage points in success rate on MedAgentBoard (81.89% vs. 65.90% for the next-best method) when using ToolUniverse
  • Achieves 3.98/5.0 on the new EHRFlowBench, significantly outperforming general agent AFlow (3.31) and biomedical agent STELLA (2.39)
  • Dominant win rate in head-to-head comparisons on EHRFlowBench (e.g., >90% win rate against Biomni and STELLA)
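The reported margins can be re-derived from the figures quoted above. The snippet below only restates that arithmetic; the relative-gain figure is computed here for context and is not a number reported in the summary.

```python
# Success rates on MedAgentBoard, as quoted in the highlights (percent).
healthflow, next_best = 81.89, 65.90

# Absolute gap in percentage points.
gain_pp = round(healthflow - next_best, 2)

# Relative improvement over the next-best method (derived, not reported).
relative_gain = round((healthflow - next_best) / next_best * 100, 1)

# EHRFlowBench scores out of 5.0, as quoted in the highlights.
ehr = {"HealthFlow": 3.98, "AFlow": 3.31, "STELLA": 2.39}
margins = {name: round(ehr["HealthFlow"] - score, 2)
           for name, score in ehr.items() if name != "HealthFlow"}

print(gain_pp, relative_gain, margins)
```

Keeping percentage points and relative percent separate matters when comparing against other papers, since the same gap reads very differently under the two conventions.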
Breakthrough Assessment
8/10
Significant step in self-evolving agents by formalizing meta-level strategic learning rather than just tool tuning. Strong empirical gains on complex tasks, though reliant on closed-source LLM backbones.