
El Agente Gráfico: Structured Execution Graphs for Scientific Agents

Jiaru Bai, Abdulrahman Aldossary, Thomas Swanick, Marcel Müller, Yeonghun Kang, Zijian Zhang, Jin Won Lee, Tsz Wai Ko, Mohammad Ghazi Vakili, Varinia Bernales, Alán Aspuru-Guzik
University of Toronto, Vector Institute for Artificial Intelligence, McGill University, Acceleration Consortium, NVIDIA
arXiv (2026)
Agent Memory, KG, Benchmark

📝 Paper Summary

Scientific Agents, Structured Execution, Agentic Workflows
El Agente Gráfico replaces unstructured text-based agent contexts with typed execution graphs and a persistent knowledge graph to enable scalable, auditable, and cost-efficient scientific workflows.
Core Problem
Current scientific agents rely on unstructured text to manage context, which creates overwhelming information volume, obscures decision provenance, and causes misconfiguration when handling heterogeneous scientific tools.
Why it matters:
  • Numerical correctness and state fidelity are critical in science but are often lost in conversational LLM contexts
  • Large scientific data artifacts (e.g., electronic densities) cannot be efficiently serialized into LLM context windows
  • Splitting work across multiple agents to manage context load introduces coordination failures and prohibitive token costs (e.g., >$4 per run in prior baselines)
Concrete Example: In a pKa prediction task, a 'bare' LLM agent (equipped only with code execution and web search) did not check for imaginary frequencies and confused solvation models, yielding an implausible pKa ≈ -5.0.
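The check the bare agent skipped can be written as an explicit validation gate that runs before any thermochemistry is computed. The sketch below is illustrative only and is not taken from the paper: the names `assert_true_minimum` and `ImaginaryFrequencyError`, the 10 cm^-1 tolerance, and the convention of encoding imaginary modes as negative wavenumbers are all assumptions, and the frequency lists are invented.

```python
from typing import Sequence


class ImaginaryFrequencyError(ValueError):
    """Raised when a structure is not a true minimum on the potential energy surface."""


def assert_true_minimum(frequencies_cm1: Sequence[float], tol_cm1: float = 10.0) -> None:
    """Reject a structure whose vibrational spectrum contains imaginary modes.

    Assumption: imaginary modes are encoded as negative wavenumbers (cm^-1).
    Negative modes with magnitude at or below `tol_cm1` are treated as noise.
    """
    imaginary = [f for f in frequencies_cm1 if f < -tol_cm1]
    if imaginary:
        raise ImaginaryFrequencyError(
            f"{len(imaginary)} imaginary mode(s) found: {imaginary}; "
            "re-optimize the geometry before computing free energies"
        )


# Hypothetical frequency lists, invented for illustration.
assert_true_minimum([3.2, 512.4, 1650.1, 3400.8])  # all real modes: passes silently

try:
    assert_true_minimum([-412.7, 512.4, 1650.1])  # one imaginary mode: blocked
except ImaginaryFrequencyError as err:
    print("blocked:", err)
```

Raising an exception, rather than returning a warning string, means a downstream step cannot silently proceed with a saddle-point geometry.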
Key Novelty
Type-Safe Execution Graphs backed by Knowledge Graph Persistence
  • Embeds LLM decision-making within structured 'execution graphs' where nodes represent validated state transformations (e.g., DFT calculations) rather than free-form text
  • Uses an Object-Graph Mapper (OGM) to serialize Python objects into a persistent Knowledge Graph, allowing 'heavy' scientific data to be referenced by symbolic identifiers (IRIs) instead of raw text
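The two ideas above can be sketched together in a few lines. This is a minimal toy under stated assumptions, not the paper's implementation: `ToyKnowledgeGraph`, `Node`, `Geometry`, `Centroid`, and the `https://example.org/kg/` base IRI are hypothetical names, and an in-memory dictionary stands in for a persistent knowledge graph and OGM. The point it illustrates is that each node checks the types of its input and output, and that only short IRIs, never the underlying data, would need to appear in an LLM's context.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Callable, Dict, Tuple, Type


@dataclass(frozen=True)
class Geometry:
    """Typed state: element symbols plus Cartesian coordinates."""
    symbols: Tuple[str, ...]
    coords: Tuple[Tuple[float, float, float], ...]

    def __post_init__(self) -> None:
        if len(self.symbols) != len(self.coords):
            raise ValueError("each atom needs exactly one coordinate triple")


@dataclass(frozen=True)
class Centroid:
    """Typed result of the example transformation."""
    xyz: Tuple[float, ...]


class ToyKnowledgeGraph:
    """Hypothetical in-memory stand-in for a persistent knowledge graph."""

    def __init__(self, base: str = "https://example.org/kg/") -> None:
        self._base = base
        self._ids = count(1)
        self._objects: Dict[str, Any] = {}

    def put(self, obj: Any) -> str:
        """Store an object and return the IRI that now refers to it."""
        iri = f"{self._base}{type(obj).__name__}/{next(self._ids)}"
        self._objects[iri] = obj
        return iri

    def get(self, iri: str, expected: Type) -> Any:
        """Resolve an IRI, refusing objects of the wrong type."""
        obj = self._objects[iri]
        if not isinstance(obj, expected):
            raise TypeError(f"{iri} holds {type(obj).__name__}, expected {expected.__name__}")
        return obj


@dataclass(frozen=True)
class Node:
    """One execution-graph node: a typed, validated state transformation."""
    name: str
    input_type: Type
    output_type: Type
    fn: Callable[[Any], Any]

    def run(self, kg: ToyKnowledgeGraph, input_iri: str) -> str:
        value = kg.get(input_iri, self.input_type)   # validate the input type
        result = self.fn(value)
        if not isinstance(result, self.output_type):  # validate the output type
            raise TypeError(f"node {self.name!r} returned {type(result).__name__}")
        return kg.put(result)                         # hand back only an IRI


def centroid(geom: Geometry) -> Centroid:
    """Cheap placeholder for an expensive calculation such as DFT."""
    n = len(geom.coords)
    return Centroid(tuple(sum(axis) / n for axis in zip(*geom.coords)))


kg = ToyKnowledgeGraph()
water = Geometry(("O", "H", "H"), ((0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)))
geom_iri = kg.put(water)
node = Node("centroid", Geometry, Centroid, centroid)
out_iri = node.run(kg, geom_iri)
print(geom_iri, "->", out_iri)
```

Passing `out_iri` back into the same node raises a `TypeError`, because the stored object is a `Centroid` rather than a `Geometry`. That is the kind of misconfiguration a free-text context would let through.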
Evaluation Highlights
  • Reduces operating cost by ~96% compared to the multi-agent 'El Agente Q' baseline ($4.67 → $0.17 per run with gpt-5)
  • Achieves >6x speedup in wall-clock time (1,827s → 228s) by eliminating inter-agent communication overhead and enabling parallel execution
  • Outperforms El Agente Q on numerical correctness, achieving 98.88% accuracy with gpt-5 compared to the baseline's 88.25%
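The headline figures can be cross-checked from the reported numbers alone. The helper names below are illustrative; the inputs are the values quoted in the bullets above.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the new wall-clock time is."""
    return before_s / after_s


cost_cut = percent_reduction(4.67, 0.17)   # about 96.4 percent
time_ratio = speedup(1827.0, 228.0)        # about 8.0
accuracy_gain = 98.88 - 88.25              # about 10.6 percentage points

print(f"cost reduction: {cost_cut:.1f}%")
print(f"wall-clock speedup: {time_ratio:.1f}x")
print(f"accuracy gain: {accuracy_gain:.2f} percentage points")
```

The cost figure matches the stated ~96%, and the ratio of the two reported times is about 8, which is comfortably above the stated >6x.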
Breakthrough Assessment
9/10
Drastically reduces the cost and complexity of scientific agents while improving accuracy. The shift from text-based context to typed execution graphs addresses the fundamental bottleneck of LLM context limits in data-heavy domains.