
IronEngine: Towards General AI Assistant

Xi Mo
NiusRobotLab
arXiv (2026)
Agent Memory Reasoning MM

📝 Paper Summary

Layered memory · Multi-agent · Multi-call tool use with flexible plan
IronEngine is a local-first AI assistant framework that decouples planning from execution via a three-phase pipeline, using heterogeneous models and hierarchical memory to address fragmentation and reliability problems.
Core Problem
Current AI assistants suffer from fragmentation (disjoint tools), single-model bottlenecks (inefficient resource use), ephemeral memory (stateless sessions), and poor local deployment reliability.
Why it matters:
  • Users must switch between disjoint tools for different tasks (web, desktop, files) rather than using a unified interface
  • Single-model designs either waste compute by running a large model on simple formatting tasks or fail at complex planning when the single model is small
  • Lack of structured persistence forces users to re-teach preferences and workflows in every new session
  • Privacy-sensitive workloads require local execution, but managing VRAM for multiple models on consumer hardware is unsolved in most frameworks
Concrete Example: A small 3.8B parameter tool model might generate valid JSON but specify the wrong tool type (e.g., 'web_read' instead of 'web_search'). Standard systems fail outright, whereas a robust system should detect the semantic mismatch and redirect the request automatically.
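The redirect described in this example can be sketched roughly as follows. This is a minimal illustration, not IronEngine's actual code: the tool registry, the alias table, the `ToolCall` type, and the "a bare query belongs to web_search" heuristic are all assumptions made for the sketch.

```python
from dataclasses import dataclass

# Hypothetical tool registry and alias table (illustrative names only).
KNOWN_TOOLS = {"web_search", "web_read", "file_read"}
ALIASES = {
    "websearch": "web_search",
    "search_web": "web_search",
    "read_web": "web_read",
}


@dataclass
class ToolCall:
    tool: str
    args: dict


def normalize_tool_name(name: str) -> str:
    """Map spelling variants of a tool name onto a canonical name."""
    key = name.strip().lower().replace("-", "_").replace(" ", "_")
    return ALIASES.get(key, key)


def redirect_on_mismatch(call: ToolCall) -> ToolCall:
    """Normalize the tool name, then fix an obvious tool/argument mismatch."""
    tool = normalize_tool_name(call.tool)
    if tool not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {call.tool!r}")
    # Assumed heuristic: web_read needs a URL, so a call that carries only
    # a search query was most likely meant for web_search.
    if tool == "web_read" and "url" not in call.args and "query" in call.args:
        tool = "web_search"
    return ToolCall(tool, call.args)


fixed = redirect_on_mismatch(ToolCall("web_read", {"query": "local LLM VRAM"}))
print(fixed.tool)
```

A call that already carries a `url` argument passes through unchanged, so the redirect only triggers when the arguments contradict the chosen tool.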
Key Novelty
Unified Orchestration with Heterogeneous Model Allocation
  • Decouples cognitive roles into a three-phase pipeline (Discussion, Model Switch, Execution), assigning different model sizes to Planner (reasoning), Reviewer (quality gate), and Executor (tool use)
  • Implements a VRAM-aware model lifecycle that dynamically loads and unloads models on a single GPU to overcome hardware constraints
  • Features a dual-merge hierarchical memory system that combines fast hash-based deduplication with model-based daily consolidation for long-term retention
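One way to picture the VRAM-aware lifecycle is a manager that tracks a fixed memory budget and unloads models before loading the next one. The sketch below is an assumption-laden illustration: the class name, the least-recently-used eviction policy, and the model sizes are hypothetical, and the summary does not state which policy IronEngine uses.

```python
from collections import OrderedDict


class VramModelManager:
    """Keeps simulated models within a VRAM budget, evicting the least recently used."""

    def __init__(self, budget_gb: float):
        self.budget_gb = budget_gb
        # Maps model name -> size in GB, ordered from least to most recently used.
        self.loaded = OrderedDict()

    def used_gb(self) -> float:
        return sum(self.loaded.values())

    def acquire(self, name: str, size_gb: float) -> list:
        """Ensure `name` is loaded; return the names evicted to make room."""
        if size_gb > self.budget_gb:
            raise MemoryError(f"{name} ({size_gb} GB) exceeds the {self.budget_gb} GB budget")
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return []
        evicted = []
        while self.used_gb() + size_gb > self.budget_gb:
            victim, _ = self.loaded.popitem(last=False)  # drop least recently used
            evicted.append(victim)
        self.loaded[name] = size_gb  # a real system would load weights here
        return evicted


# Hypothetical sizes for the three roles on a 24 GB GPU.
manager = VramModelManager(budget_gb=24.0)
manager.acquire("planner", 18.0)
manager.acquire("executor", 4.0)
evicted = manager.acquire("reviewer", 8.0)
print(evicted, list(manager.loaded))
```

In this run the planner is unloaded to make room for the reviewer, which mirrors the idea of switching models between pipeline phases on a single GPU.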
Evaluation Highlights
  • Compared to direct model routing, alias normalization and automatic error correction reduce tool dispatch failures by an order of magnitude
  • Successfully manages a 46,690-line codebase with 97 source files, integrating 24 tool categories under one orchestration core
  • Demonstrates desktop automation capability where standard accessibility approaches fail (e.g., WeChat) by falling back to visual analysis
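The accessibility-to-visual fallback mentioned in the last highlight can be sketched as a simple two-step lookup. The function and both lookup callables are hypothetical stand-ins; the summary does not describe how IronEngine implements either backend.

```python
from typing import Callable, Optional, Tuple

# A lookup takes a UI label and returns screen coordinates, or None if not found.
Lookup = Callable[[str], Optional[Tuple[int, int]]]


def locate_element(label: str, accessibility_lookup: Lookup, visual_lookup: Lookup):
    """Try the accessibility tree first, then fall back to visual analysis."""
    try:
        hit = accessibility_lookup(label)
    except Exception:
        hit = None  # treat a failing accessibility backend like a miss
    if hit is not None:
        return "accessibility", hit
    hit = visual_lookup(label)
    if hit is None:
        raise LookupError(f"could not locate {label!r} with either method")
    return "visual", hit


# Simulated app that exposes no accessibility tree.
method, position = locate_element(
    "Send",
    accessibility_lookup=lambda label: None,
    visual_lookup=lambda label: (640, 480),
)
print(method, position)
```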
Breakthrough Assessment
7/10
Strong systems engineering contribution addressing practical deployment issues (VRAM, fragmentation) often ignored in pure research. Novelty lies in the integration and lifecycle management rather than new model architectures.