
ProtAgents: protein discovery via large language model multi-agent collaborations combining physics and machine learning

Alireza Ghafarollahi, Markus J. Buehler
Massachusetts Institute of Technology
Digital Discovery (2024)
Agent, RAG, Reasoning

📝 Paper Summary

Multi-agent, Tool profiling, Multi-task planning
ProtAgents is a multi-agent framework where specialized LLM agents collaborate using physics simulators, generative models, and retrieval tools to automate complex protein design and analysis tasks.
Core Problem
Current protein design methodologies often rely on isolated AI models that lack flexibility, cannot easily integrate out-of-domain knowledge or physics-based simulations, and struggle with complex multi-step reasoning.
Why it matters:
  • The protein sequence space is vast (20^100 possible sequences for a protein of only 100 residues), requiring efficient navigation tools beyond simple surrogate models.
  • Combining data-driven tools with physics-based modeling is crucial for accurate predictions but difficult to automate in a single workflow.
  • Existing tools often require significant human intervention to bridge the gap between literature retrieval, structural design, and physical property analysis.
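To make the scale of the sequence space concrete, the short sketch below counts the sequences of a given length, assuming the 20 standard amino acids. The function names are illustrative only and do not come from the paper.

```python
import math

# Assumption: the 20 standard amino acids, one choice per residue position.
AMINO_ACIDS = 20


def sequence_space_size(length: int) -> int:
    """Exact number of distinct sequences with `length` residues."""
    return AMINO_ACIDS ** length


def sequence_space_log10(length: int) -> float:
    """Base-10 logarithm of the sequence count, handy for very long chains."""
    return length * math.log10(AMINO_ACIDS)


if __name__ == "__main__":
    n = sequence_space_size(100)
    print(f"20^100 has {len(str(n))} decimal digits")
    print(f"log10(20^100) = {sequence_space_log10(100):.1f}")
```

For a 100-residue protein the count is on the order of 10^130, which is why exhaustive enumeration is not an option.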
Concrete Example: A user asks for the names of proteins with specific experimental properties, then for their PDB IDs, and finally for a natural-frequency simulation restricted to the proteins under a certain length. A single standalone model might hallucinate IDs or fail to apply the conditional logic (checking length before simulating), whereas ProtAgents uses a planner to schedule the checks and an assistant to run the physics code.
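The conditional step in this example can be sketched in plain Python as a length check that gates the expensive simulation. This is a minimal illustration, not the paper's code: `simulate_if_short`, the placeholder IDs, their lengths, and the dummy simulation are all assumptions.

```python
from typing import Callable, Dict, List, Union

Result = Union[List[float], str]


def simulate_if_short(
    pdb_ids: List[str],
    get_length: Callable[[str], int],
    run_simulation: Callable[[str], List[float]],
    max_length: int,
) -> Dict[str, Result]:
    """Run the simulation only for proteins whose length is within the limit."""
    results: Dict[str, Result] = {}
    for pdb_id in pdb_ids:
        length = get_length(pdb_id)
        if length > max_length:
            # Cheap check first: skip the costly physics call entirely.
            results[pdb_id] = f"skipped (length {length} > {max_length})"
            continue
        results[pdb_id] = run_simulation(pdb_id)
    return results


# Placeholder data (hypothetical IDs and lengths, not real PDB entries).
PLACEHOLDER_LENGTHS = {"XXXX": 90, "YYYY": 300}
simulated: List[str] = []


def fake_simulation(pdb_id: str) -> List[float]:
    """Stand-in for a physics solver; records the call and returns dummy values."""
    simulated.append(pdb_id)
    return [0.0, 0.0, 0.0]


results = simulate_if_short(
    list(PLACEHOLDER_LENGTHS),
    PLACEHOLDER_LENGTHS.__getitem__,
    fake_simulation,
    max_length=128,
)
```

Passing the length lookup and the simulator as arguments keeps the gating logic independent of whichever retrieval or physics tool the agents actually call.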
Key Novelty
LLM-driven Multi-Agent Collaboration with Physics Integration
  • Deploys a team of specialized agents (Planner, Assistant, Critic) that converse to solve problems, rather than a single model attempting all tasks.
  • Integrates 'hard' physics tools (solving partial differential equations for vibrational frequencies) directly into the agents' action space alongside 'soft' knowledge retrieval.
  • Utilizes a 'Critic' agent to autonomously identify errors in plans or code outputs (e.g., malformed JSON) and suggest corrections without a human in the loop.
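The division of labour among the three roles can be illustrated with a toy loop in which a planner emits steps, an assistant executes them through a tool registry, and a critic inspects each result. This is a hedged sketch with no LLM calls: the function names, the hard-coded plan, and the toy tools are assumptions for illustration and are not the paper's implementation or any agent library's API.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Step = Tuple[str, Dict[str, Any]]  # (tool name, keyword arguments)
Tool = Callable[..., Any]


def planner(task: str) -> List[Step]:
    """Hypothetical planner: a real one would ask an LLM to decompose `task`."""
    return [("design", {"length": 50}), ("analyze", {})]


def assistant(step: Step, tools: Dict[str, Tool], state: Dict[str, Any]) -> Any:
    """Executes one step by calling the named tool with the shared state."""
    name, kwargs = step
    return tools[name](state, **kwargs)


def critic(step: Step, result: Any) -> Optional[str]:
    """Returns a problem description, or None when the result looks usable."""
    if result is None:
        return f"step '{step[0]}' returned nothing"
    return None


def run(task: str, tools: Dict[str, Tool]) -> Tuple[Dict[str, Any], List[str]]:
    """Plan, execute, and review each step; stop at the first reported problem."""
    state: Dict[str, Any] = {}
    log: List[str] = []
    for step in planner(task):
        result = assistant(step, tools, state)
        problem = critic(step, result)
        if problem is not None:
            log.append(f"critic: {problem}")
            break
        state[step[0]] = result
        log.append(f"ok: {step[0]}")
    return state, log


# Toy tools standing in for generative and physics tools (assumptions).
TOY_TOOLS: Dict[str, Tool] = {
    "design": lambda state, length: "A" * length,
    "analyze": lambda state: len(state["design"]),
}

state, log = run("design a protein and analyze it", TOY_TOOLS)
```

In a real system each role would be backed by an LLM and the critic's feedback would be returned to the planner for a revised plan rather than simply halting the loop.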
Architecture
Figure 1: Overview of the ProtAgents multi-agent framework, showing the interaction between User, Chat Manager, and Agents (Planner, Assistant, Critic).
Evaluation Highlights
  • Successfully executed a multi-step workflow involving protein design (Chroma), folding (OmegaFold), and physics simulation (Normal Mode Analysis) without human intervention.
  • The 'Critic' agent autonomously detected and fixed a JSON formatting error that caused a function failure, enabling the system to save results successfully.
  • Correctly applied conditional logic: identified that protein '1hz6' (length 216) exceeded the length limit of 128 and skipped subsequent expensive computations.
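The kind of check a critic could apply to function-call arguments before they reach a tool can be sketched with Python's standard `json` module. The summary does not specify the exact error the Critic caught, so the malformed string below (single-quoted keys) and the name `check_tool_arguments` are purely illustrative assumptions.

```python
import json
from typing import Any, Dict, Optional, Tuple


def check_tool_arguments(raw: str) -> Tuple[Optional[Dict[str, Any]], Optional[str]]:
    """Parse a JSON argument string; return (arguments, None) or (None, error)."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Report where parsing failed so the caller can request a corrected string.
        return None, f"invalid JSON at position {exc.pos}: {exc.msg}"
    if not isinstance(parsed, dict):
        return None, "arguments must be a JSON object"
    return parsed, None


# Hypothetical argument strings: single quotes are not valid JSON.
bad = "{'file_name': 'results.csv'}"
good = '{"file_name": "results.csv"}'

bad_args, bad_error = check_tool_arguments(bad)
good_args, good_error = check_tool_arguments(good)
```

Returning the error message instead of raising lets the feedback be passed back into the conversation as text, which is how a critic-style agent would ask for a retry.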
Breakthrough Assessment
7/10
Strong application of multi-agent frameworks to scientific discovery. The integration of physics solvers is significant, though the underlying agent architecture (AutoGen-style) is an application of existing methods rather than a fundamental architectural shift.