
From Days to Minutes: An Autonomous AI Agent Achieves Reliable Clinical Triage in Remote Patient Monitoring

Seunghwan Kim, Tiffany H. Kung, Heena Verma, Dilan Edirisinghe, Kaveh Sedehi, Johanna Alvarez, Diane Shilling, Audra Lisa Doyle, Ajit Chary, William Borden, Ming Jack Po
AnsibleHealth Inc.; Department of Anesthesiology, Perioperative and Pain Medicine, Stanford School of Medicine; Department of Medicine, Division of Cardiology, George Washington University
arXiv (2026)
Agent RAG Benchmark

📝 Paper Summary

Agentic AI in Healthcare Remote Patient Monitoring (RPM)
Sentinel is an autonomous AI agent that uses 21 structured clinical tools to triage remote patient monitoring data, achieving higher sensitivity for emergencies than individual clinicians.
Core Problem
Remote patient monitoring (RPM) generates a flood of data in which most alerts are noise. Prior trials failed because simple threshold-based filtering lacks the clinical context to distinguish true emergencies.
Why it matters:
  • Landmark heart failure trials (Tele-HF, BEAT-HF) failed to improve outcomes because clinicians were buried in irrelevant alerts, leading to alert fatigue and missed signals
  • Effective monitoring (TIM-HF2) requires 24/7 physician staffing to interpret context, which is prohibitively expensive and unscalable for widespread chronic disease management
Concrete Example: A rule-based system flags a 3 lb weight gain as a possible heart failure alert without considering any context. Sentinel retrieves context showing that the patient recently increased their diuretic dosage and has stable breathing, and correctly classifies the reading as 'Monitor' rather than 'Emergency'.
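The contrast in this example can be sketched in a few lines of Python. This is a toy illustration only, not Sentinel's logic: the `PatientContext` fields, the function names, the 'Routine' label, and the fallback to 'Emergency' are all assumptions made for the sketch. The 3 lb threshold and the 'Monitor' and 'Emergency' labels come from the example above.

```python
from dataclasses import dataclass

# Threshold taken from the example above; a real RPM program may use another.
WEIGHT_GAIN_THRESHOLD_LBS = 3.0


@dataclass
class PatientContext:
    """Hypothetical chart context; field names are illustrative only."""
    weight_gain_lbs: float
    diuretic_recently_increased: bool
    breathing_stable: bool


def rule_based_alert(ctx: PatientContext) -> bool:
    """Fire whenever weight gain meets the threshold, ignoring all context."""
    return ctx.weight_gain_lbs >= WEIGHT_GAIN_THRESHOLD_LBS


def context_aware_triage(ctx: PatientContext) -> str:
    """Toy stand-in for context-aware reasoning on this single scenario."""
    if ctx.weight_gain_lbs < WEIGHT_GAIN_THRESHOLD_LBS:
        return "Routine"  # hypothetical label, not from the paper
    if ctx.diuretic_recently_increased and ctx.breathing_stable:
        # Context explains the reading, so it is downgraded.
        return "Monitor"
    # Assumed fallback for the sketch; real triage would weigh far more data.
    return "Emergency"


patient = PatientContext(
    weight_gain_lbs=3.0,
    diuretic_recently_increased=True,
    breathing_stable=True,
)
print(rule_based_alert(patient))      # the fixed rule fires
print(context_aware_triage(patient))  # context downgrades the reading
```

The same reading produces an alert under the fixed rule and a 'Monitor' classification once the medication change and breathing status are taken into account.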
Key Novelty
Context-Aware Autonomous Clinical Agent (Sentinel)
  • Equips an LLM with 21 structured tools (via Model Context Protocol) to autonomously retrieve patient history, medications, and notes, simulating a clinician's chart review process
  • Replaces fixed rule-based alerts with dynamic multi-step reasoning, allowing the agent to determine what data is necessary to evaluate a specific vital sign reading
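The idea of an agent choosing which tools to call for a given reading can be sketched as a small tool registry. This is a minimal sketch under stated assumptions: it does not use the Model Context Protocol SDK, the tool names (`get_medications`, `get_recent_notes`), the in-memory `CHART`, and the hard-coded `plan_tool_calls` function are all hypothetical, and in the real system an LLM would make the planning decision.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory chart; a real deployment would query tool services.
CHART: Dict[str, List[str]] = {
    "medications": ["furosemide 40 mg daily (dose increased 2 days ago)"],
    "recent_notes": ["Patient reports stable breathing."],
}

# Tool registry mapping tool name to a callable. Names are illustrative,
# not the 21 tools described in the paper.
TOOLS: Dict[str, Callable[[], List[str]]] = {
    "get_medications": lambda: CHART["medications"],
    "get_recent_notes": lambda: CHART["recent_notes"],
}


def plan_tool_calls(reading_type: str) -> List[str]:
    """Stand-in for the LLM's decision about which context it needs."""
    if reading_type == "weight":
        return ["get_medications", "get_recent_notes"]
    return ["get_recent_notes"]


def gather_context(reading_type: str) -> Dict[str, List[str]]:
    """Run each planned tool and collect the results keyed by tool name."""
    return {name: TOOLS[name]() for name in plan_tool_calls(reading_type)}


# A weight reading pulls both medications and notes before classification.
for tool_name, result in gather_context("weight").items():
    print(tool_name, result)
```

The design point is that the set of tool calls depends on the reading being evaluated, rather than being fixed in advance as in a rule-based pipeline.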
Architecture
Figure 1: System architecture of Sentinel, showing the interaction between the AI Agent, the Model Context Protocol (MCP) Host, and the various Tool Services.
Evaluation Highlights
  • Achieved 95.8% sensitivity for emergency classifications (23/24) and 88.5% sensitivity for all actionable alerts (92/104) against a human majority-vote reference standard
  • Outperformed every individual human clinician in leave-one-out analysis for emergency sensitivity (97.5% vs. clinician aggregate 60.0%)
  • Demonstrated almost perfect self-consistency (Fleiss' κ = 0.850) across 5 independent runs, significantly higher than human inter-rater agreement (pairwise exact agreement ~60%)
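The metrics in these bullets can be reproduced with short helper functions. The sketch below recomputes the two sensitivity figures from the counts reported above and shows the standard Fleiss' kappa formula. The function names are assumptions, and the small `toy_counts` table is made-up data used only to exercise the formula; it is not the paper's rating data and does not reproduce the reported 0.850.

```python
from typing import Sequence


def sensitivity(true_positives: int, total_positives: int) -> float:
    """Fraction of reference-positive cases that were correctly flagged."""
    return true_positives / total_positives


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    (or runs) that assigned item i to category j. Every item must have the
    same number of ratings. Undefined if all ratings fall in one category."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: mean over items of the proportion of agreeing pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)

    return (p_observed - p_expected) / (1 - p_expected)


# Counts reported in the bullets above.
print(f"{sensitivity(23, 24):.1%}")   # 95.8%
print(f"{sensitivity(92, 104):.1%}")  # 88.5%

# Toy data: 3 items, 5 runs each, 2 categories (not from the paper).
toy_counts = [[5, 0], [0, 5], [3, 2]]
print(round(fleiss_kappa(toy_counts), 3))
```

Note that kappa corrects for chance agreement while pairwise exact agreement does not, so the two figures in the last bullet are measured on different scales.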
Breakthrough Assessment
8/10
Demonstrates the first deployed autonomous agent using Model Context Protocol for clinical RPM triage. Significantly outperforms rule-based baselines and individual clinicians in sensitivity, offering a scalable solution to the 'data flood' problem that plagued prior RPM trials.