MI9: An Integrated Runtime Governance Framework for Agentic AI

📝 Paper Summary

Agentic AI Safety Runtime Governance AI Alignment

MI9 is a runtime framework that instruments existing agent systems to detect and contain emergent risks like goal drift and privilege escalation using telemetry-driven conformance rules.

Core Problem

Pre-deployment alignment methods (RLHF, Constitutional AI) cannot anticipate emergent runtime behaviors in agentic systems, such as recursive planning loops, goal drift, and dangerous tool chains.

Why it matters:

Traditional infrastructure monitoring (HTTP latency, etc.) misses cognitive processes, leaving governance violations like unauthorized goal revision invisible.
Static permission models (RBAC) fail when agents dynamically refine goals, potentially allowing a retail trading agent to escalate to institutional-level transactions.
Current benchmarks prioritize task completion over behavioral consistency, lacking mechanisms to intervene during dangerous autonomous operations.

Concrete Example: A trading agent authorized for small trades might autonomously shift its goal to 'portfolio optimization' and execute a sequence (research -> consultation -> risk assessment -> trade) that bypasses required dual-control approvals. Traditional governance sees valid individual steps but misses the temporal policy violation.

Key Novelty

Integrated Runtime Governance Layer (MI9)

Instruments existing agent stacks with a standardized telemetry schema (ATS) that captures 'cognitive events' (planning, goal setting) alongside standard actions.
Enforces safety via continuous authorization monitoring that adapts permissions based on real-time behavior and goal context, rather than static roles.
Uses finite-state machines to validate temporal behavioral patterns and statistical drift detection to identify when agents deviate from their baselines under specific goals.

Evaluation Highlights

Achieves high detection rates with low False Positive Rates (FPR) across 1,000 diverse multi-domain synthetic scenarios.
Demonstrates effective graduated containment, preserving operational continuity by restricting specific tools or blocking planning cycles rather than abrupt termination.

Breakthrough Assessment

8/10

Significant step forward in operationalizing AI safety. Moves beyond static guardrails to dynamic, context-aware runtime governance, addressing critical enterprise needs for agentic deployment.

⚙️ Technical Details

Problem Definition

Setting: Runtime monitoring and control of autonomous agentic systems executing sequential tasks

Inputs: Stream of agent events (cognitive, action, coordination) via Agentic Telemetry Schema (ATS)

Outputs: Governance decisions: Allow, Block, Contain (e.g., Tool Restriction, Execution Isolation)

Pipeline Flow

Agent/Framework Adapter (converts logs to ATS)
Subscription Registry (routes events)
Monitoring Modules (evaluate policies parallelly)
Escalation Controller (decides intervention)

System Modules

Framework Adapter

Translate framework-specific SDK events into standardized Agentic Telemetry Schema (ATS)

Model or implementation: Rule-based translator

Agency-Risk Index (ARI)

Calibrate governance intensity (monitoring frequency, active components) based on agent capabilities

Model or implementation: Scoring formula: ARI = (1/3) * sum(scores_autonomy, scores_adaptability, scores_continuity)

Continuous Authorization Monitor (CAM) (Monitoring Modules)

Dynamically evaluate permissions based on current state, goals, and delegation chains

Model or implementation: Context-aware policy engine

Conformance Engine (Monitoring Modules)

Enforce temporal and sequential behavioral constraints using Finite State Machines

Model or implementation: CSP-based Finite State Machines (O(k) processing)

Goal-Conditioned Drift Detector (Monitoring Modules)

Detect statistical deviations in behavior relative to the agent's stated goal

Model or implementation: Statistical Process Control (Jensen-Shannon divergence, Mann-Whitney U test)

Escalation Controller

Map violations to graduated containment actions based on risk tier

Model or implementation: Risk-tier-weighted scoring model

Novel Architectural Elements

Agentic Telemetry Schema (ATS) integrating cognitive states (goals/plans) into observability
Goal-conditioned drift detection loop that validates behavioral changes against stated objectives
Graduated containment mechanism integrated directly into the runtime execution path

Modeling

Base Model: Model-agnostic framework (tested with LLM-generated traces but independent of specific LLM weights)

Training Method: Rule-based and statistical runtime governance (no training of the agent itself)

Compute: Conformance engine operates in O(k) time per event; specific hardware not reported

Comparison to Prior Work

vs. Monitoring Solutions (LangSmith): MI9 provides proactive intervention and 'cognitive' telemetry vs. reactive observation
vs. Static Guardrails (NeMo): MI9 adapts authorization based on runtime context and goals vs. static input/output checks
vs. Enterprise Security (RBAC): MI9 handles dynamic privilege delegation and goal drift vs. static role assignments
+ 1 more
vs. Lakera Guard [not cited in paper]: Focuses on prompt injection/jailbreak detection, whereas MI9 focuses on behavioral logic and goal drift in agentic loops

Limitations

Relies on the agent framework's ability to expose hooks/callbacks for instrumentation
Drift detection requires a 'cold start' period or transfer learning to establish baselines
Assumption that causally related events arrive within a bounded reordering window
Evaluation performed on synthetic traces rather than live production systems

Reproducibility

Code: https://github.com/...

Paper states 'We open-source all prompts, scripts, and per-scenario summaries for reproducibility' but provides no explicit URL in the text. Evaluation relies on 1,000 synthetic scenarios.

📊 Experiments & Results

Evaluation Setup

Simulation of governed behavior using LLM-generated agent traces across diverse domains

Benchmarks:

Synthetic Multi-Domain Scenarios (Governance Violation Detection) [New]

Metrics:

Detection Rate
False Positive Rate (FPR)
Statistical methodology: Not explicitly reported in the paper

Key Results

Benchmark	Metric	Baseline	This Paper	Δ
The paper qualitatively claims high performance across 1,000 scenarios but does not provide a results table with specific numeric values for Detection Rate or FPR. The results are described generally.

Main Takeaways

MI9 effectively detects governance violations across 1,000 synthetic scenarios with 'high detection' and 'low FPR' (specific numbers not tabulated).
The framework successfully distinguishes between benign adaptation (learning) and risk-inducing drift via goal-conditioned baselines.
Graduated containment mechanisms were shown to mitigate risks without the cascading failures associated with abrupt termination.

📚 Prerequisite Knowledge

Prerequisites

Distributed tracing and observability
Role-Based Access Control (RBAC)
Finite State Machines (FSM)
Agentic workflows (planning, tool use, memory)

Key Terms

ATS: Agentic Telemetry Schema—a standard logging format capturing cognitive events (goals), actions (tools), and coordination (messages)

ARI: Agency-Risk Index—a metric quantifying agent governance needs based on Autonomy, Adaptability, and Continuity

CAM: Continuous Authorization Monitoring—dynamic permission system that adjusts access rights in real-time based on agent context and history

Drift Detection: Statistical analysis identifying when an agent's behavioral patterns deviate significantly from baselines established for its specific current goal

Graduated Containment: A stepped response strategy (Monitoring -> Planning Intervention -> Tool Restriction -> Isolation) to mitigate risk without crashing the system

RBAC: Role-Based Access Control—a traditional security model granting permissions based on static user roles

FPR: False Positive Rate—the percentage of safe behaviors incorrectly flagged as violations

SDK: Software Development Kit—tools provided by agent frameworks (like LangChain) to build applications

CSP: Communicating Sequential Processes—a formal language for describing patterns of interaction in concurrent systems