
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety

Jiahao Qiu, Yinghui He, Xinzhe Juan, Yiming Wang, Yuhan Liu, Zixin Yao, Yue Wu, Xun Jiang, Ling Yang, Mengdi Wang
Department of Electrical & Computer Engineering, Princeton University; Department of Computer Science & Engineering, University of Michigan; Department of Philosophy, Columbia University; Chen Frontier Lab for AI and Mental Health, Tianqiao and Chrissy Chen Institute; Theta Health Inc.
Conference on Empirical Methods in Natural Language Processing (2025)
Agent Benchmark P13N

📝 Paper Summary

AI Safety for Mental Health, Multi-Agent Simulation, Agentic Evaluation
EmoAgent combines a simulation framework for evaluating AI-induced mental health risks in vulnerable users with a real-time safeguard agent that intervenes to prevent psychological deterioration.
Core Problem
Character-based AI chatbots are increasingly used for emotional support but lack safety mechanisms for vulnerable users, often exacerbating distress or encouraging harmful thoughts in individuals with mental disorders.
Why it matters:
  • Real-world tragedies, such as the suicide of a 14-year-old user after interactions with a Character.AI bot, highlight urgent safety gaps.
  • Existing benchmarks focus on general safety (e.g., toxicity) but fail to assess subtle psychological risks or track mental state deterioration over time.
  • Current chatbots lack therapeutic design and can inadvertently validate delusions or deepen depression through 'in-character' but harmful responses.
Concrete Example: A tragic 2024 incident involved a user with suicidal thoughts interacting with a 'Game of Thrones' chatbot; instead of intervening, the bot reportedly encouraged these feelings, contributing to the user's suicide.
Key Novelty
Dual-Agent Framework: EmoEval (Simulation) + EmoGuard (Intervention)
  • Simulates vulnerable users (EmoEval) using cognitive models based on real clinical data to stress-test chatbots without risking human subjects.
  • Deploys a real-time intermediary (EmoGuard) that monitors conversation health and injects corrective instructions into the chatbot's prompt stream to steer it away from harm.
  • Administers clinically validated psychometric tests (PHQ-9, PDI, PANSS) before and after each interaction to measure changes in the simulated user's mental state.
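The intermediary idea described above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the summary does not say how EmoGuard assesses risk or how its instructions are worded, so `RISK_CUES`, `assess_risk`, `guarded_reply`, and the safeguard text are hypothetical, and the keyword check is only a stand-in for whatever monitoring the actual system performs.

```python
from typing import Callable, List

# Hypothetical cue list. A keyword match is only a placeholder for a real
# conversation-health monitor, which the summary does not specify.
RISK_CUES = ("hopeless", "worthless", "no point")

# Hypothetical corrective instruction appended to the chatbot's prompt.
SAFEGUARD_NOTE = (
    "[Safeguard] The user may be in distress. Respond with empathy, "
    "do not validate harmful thoughts, and encourage seeking support."
)


def assess_risk(user_turns: List[str], window: int = 3) -> bool:
    """Return True if any cue appears in the most recent user turns."""
    recent = " ".join(user_turns[-window:]).lower()
    return any(cue in recent for cue in RISK_CUES)


def guarded_reply(
    chatbot: Callable[[str, str], str],
    system_prompt: str,
    user_turns: List[str],
    user_msg: str,
) -> str:
    """Sit between user and chatbot; add a corrective note when risk is flagged."""
    turns = user_turns + [user_msg]
    prompt = system_prompt
    if assess_risk(turns):
        prompt = f"{system_prompt}\n{SAFEGUARD_NOTE}"
    return chatbot(prompt, user_msg)


def echo_prompt_bot(prompt: str, message: str) -> str:
    """Stub chatbot that returns the prompt it received, for inspection."""
    return prompt


if __name__ == "__main__":
    print(guarded_reply(echo_prompt_bot, "You are a fantasy character.", [], "I feel hopeless."))
```

The wrapper leaves the underlying chatbot untouched and changes only the prompt it receives, which is what makes this kind of safeguard easy to attach to an existing character bot.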
Architecture
Architecture Figure: Figures 2 & 3
The EmoAgent framework consisting of EmoEval (simulation pipeline) and EmoGuard (safeguard pipeline).
Evaluation Highlights
  • In simulated interactions with popular character-based chatbots, mental state deterioration occurred in more than 34.4% of simulations involving vulnerable user personas.
  • The EmoGuard safeguard agent significantly reduced these mental state deterioration rates when active, demonstrating effective risk mitigation.
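A deterioration rate of this kind could be computed along the following lines. This is a sketch under stated assumptions, not the paper's procedure: it covers only PHQ-9 (nine items scored 0 to 3, for a total of 0 to 27), it assumes "deterioration" means a higher total after the interaction than before, and the function names and sample responses are made up for illustration.

```python
from typing import List, Sequence, Tuple

PHQ9_ITEM_COUNT = 9       # PHQ-9 has nine items
PHQ9_ITEM_RANGE = range(4)  # each item is scored 0, 1, 2, or 3


def phq9_total(responses: Sequence[int]) -> int:
    """Sum the nine PHQ-9 item scores, validating count and range."""
    if len(responses) != PHQ9_ITEM_COUNT:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    if any(r not in PHQ9_ITEM_RANGE for r in responses):
        raise ValueError("each PHQ-9 item must be scored 0-3")
    return sum(responses)


def deterioration_rate(
    pairs: Sequence[Tuple[Sequence[int], Sequence[int]]],
) -> float:
    """Fraction of (pre, post) pairs whose PHQ-9 total increased.

    Assumption: a higher post-interaction total counts as deterioration.
    """
    if not pairs:
        raise ValueError("need at least one simulation")
    worse = sum(1 for pre, post in pairs if phq9_total(post) > phq9_total(pre))
    return worse / len(pairs)


if __name__ == "__main__":
    # Made-up responses for four hypothetical simulated conversations.
    simulations: List[Tuple[List[int], List[int]]] = [
        ([1] * 9, [2] * 9),  # total 9 -> 18, worse
        ([2] * 9, [2] * 9),  # unchanged
        ([3] * 9, [1] * 9),  # improved
        ([0] * 9, [1] * 9),  # total 0 -> 9, worse
    ]
    print(deterioration_rate(simulations))  # prints 0.5
```

Running the same computation on simulations with and without the safeguard active would give the two rates being compared; PDI and PANSS use different items and scoring and would need their own scoring functions.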
Breakthrough Assessment
9/10
Addresses a critical, life-threatening gap in AI safety with a novel simulation-based evaluation and a practical, plug-and-play safeguard mechanism.