
Context Engineering: From Prompts to Corporate Multi-Agent Architecture

Vera V. Vishnyakova
HSE University, Moscow
arXiv (2026)
Memory, Agent, RAG

📝 Paper Summary

Agentic AI, AI Governance, Prompt Engineering Evolution
The paper establishes context engineering as a distinct discipline treating agent context as an operating system that manages information logistics, isolation, and provenance to enable scalable autonomous multi-agent systems.
Core Problem
Prompt engineering fails for autonomous multi-step agents because it cannot manage the accumulation of noise, cross-agent contamination, and economic costs inherent in long-running workflows.
Why it matters:
  • Unmanaged context leads to 'lost-in-the-middle' degradation where agents fixate on outdated history rather than current tasks.
  • Without isolation, multi-agent systems suffer from privilege escalation and data leakage (e.g., agents accessing test scenarios to cheat on tasks).
  • Naive context accumulation causes quadratic cost growth, making production-grade agentic systems economically unviable.
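
The quadratic growth mentioned in the last bullet can be illustrated with simple arithmetic. The sketch below is not from the paper; it assumes that each step appends a fixed number of tokens and that the whole history is re-sent as input at every step. The function names and the sliding-window alternative are hypothetical and used only for comparison.

```python
def naive_total_input_tokens(steps: int, tokens_per_step: int) -> int:
    """Total input tokens when every step re-sends the full history."""
    total = 0
    context = 0
    for _ in range(steps):
        context += tokens_per_step  # history grows linearly with each step
        total += context            # the whole history is re-read at this step
    return total  # equals tokens_per_step * steps * (steps + 1) / 2


def windowed_total_input_tokens(steps: int, tokens_per_step: int, window_steps: int) -> int:
    """Total input tokens when only the last `window_steps` steps are kept (assumed policy)."""
    return sum(min(i, window_steps) * tokens_per_step for i in range(1, steps + 1))


if __name__ == "__main__":
    print(naive_total_input_tokens(100, 1000))         # 5050000
    print(naive_total_input_tokens(200, 1000))         # 20100000
    print(windowed_total_input_tokens(100, 1000, 10))  # 955000
```

Under these assumptions, doubling the number of steps roughly quadruples the naive total, while a bounded window grows only linearly.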
Concrete Example: In a long session, a 'Thinking' model (e.g., Gemini 3) ignores a newly attached file and instead hallucinates connections to documents from a previous prompt because of 'similar expressions' in the accumulated history. A human prompter would catch this defect; an autonomous agent cannot.
Key Novelty
The Pyramid of Agent Engineering
  • Proposes a four-level cumulative maturity model: Prompt Engineering (instruction) → Context Engineering (environment/OS) → Intent Engineering (goals/values) → Specification Engineering (machine-readable regulations).
  • Redefines context not as a text buffer but as the 'Agent Operating System' responsible for memory management (retention/eviction), resource allocation, and process isolation.
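
To make the operating-system analogy concrete, here is a minimal sketch of the retention/eviction idea. It is an illustration under stated assumptions, not the paper's design: `ContextItem`, `AgentContext`, the `pinned` flag, the oldest-first eviction policy, and the word-count token estimate are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContextItem:
    text: str
    pinned: bool = False  # pinned items (e.g. the current task) are never evicted

    @property
    def tokens(self) -> int:
        # Crude stand-in for a real tokenizer: count whitespace-separated words.
        return len(self.text.split())


@dataclass
class AgentContext:
    budget: int                      # resource allocation: max tokens for this agent
    items: list = field(default_factory=list)

    def used(self) -> int:
        return sum(item.tokens for item in self.items)

    def add(self, text: str, pinned: bool = False) -> None:
        self.items.append(ContextItem(text, pinned))
        self._evict()

    def _evict(self) -> None:
        # Drop the oldest unpinned items until the context fits the budget.
        while self.used() > self.budget:
            for i, item in enumerate(self.items):
                if not item.pinned:
                    del self.items[i]
                    break
            else:
                break  # only pinned items remain; nothing left to evict


ctx = AgentContext(budget=10)
ctx.add("task: summarize the attached report", pinned=True)
ctx.add("old tool output one")
ctx.add("new tool output two")  # pushes usage over budget, oldest unpinned item is dropped
```

Giving each agent its own `AgentContext` instance is one simple way to approximate process isolation: nothing is shared unless it is copied explicitly.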
Evaluation Highlights
  • Cites a Manus case study showing that context caching and compression reduce inference costs by approximately 10x compared with unoptimized context.
  • Synthesizes 5 production-grade context quality criteria: Relevance, Sufficiency, Isolation, Economy, and Provenance.
  • Identifies a governance gap: 75% of enterprises plan agent deployment within two years, yet only 21% have a mature agent governance model.
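
Two of the criteria listed above, Isolation and Provenance, lend themselves to a small data-structure sketch. The example below is hypothetical and not taken from the paper: `Record`, `view_for`, the agent names, and the source labels are assumptions chosen to mirror the "agents accessing test scenarios" failure described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    text: str
    source: str         # provenance: where this piece of context came from
    readers: frozenset  # isolation: names of agents allowed to read it


def view_for(agent: str, store: list) -> list:
    """Return only the records this agent is allowed to read."""
    return [r for r in store if agent in r.readers]


store = [
    Record("Implement the parser", "ticket-tracker", frozenset({"coder", "tester"})),
    Record("Hidden test scenarios", "qa-repo", frozenset({"tester"})),
]

coder_view = view_for("coder", store)    # the coder never sees the test scenarios
tester_view = view_for("tester", store)  # the tester sees both records
```

Because every record carries its `source`, anything an agent reads can be traced back to its origin, and the per-agent view keeps restricted material out of contexts that should not contain it.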
Breakthrough Assessment
9/10
Foundational position paper that systematizes the transition from chatbots to agents. It provides the necessary taxonomy and architecture (Context as OS) for the next generation of AI development.