Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task

📝 Paper Summary

Human-AI Teams (HATs)
Theory of Mind (ToM) in AI

This study investigates Mutual Theory of Mind in human-AI teams. It finds that an agent's ToM capability improves the human's feeling of being understood, while bidirectional verbal communication in a real-time task reduces overall team performance.

Core Problem

Prior Human-AI Team (HAT) research overlooks the Mutual Theory of Mind (MToM) process, in which both parties reason about each other, and how that interplay affects real-time collaboration.

Why it matters:

Real-time shared workspace tasks (like disaster response or joint manufacturing) require immediate coordination where the cost of communication can outweigh its benefits.
Existing HAT studies often treat agent ToM and communication interactivity in isolation, missing how the bidirectional reasoning process shapes team dynamics.
Understanding MToM is critical for designing agents that actually feel collaborative rather than just functional.

Concrete Example: In a frantic cooking game like Overcooked, if a human has to stop chopping onions to type 'pass me a plate' to an AI (bidirectional communication), the time lost typing might cause the burger to burn, leading to worse performance than if they just worked silently.

Key Novelty

Empirical MToM Analysis in Real-Time HATs

Conducts a mixed-design study (Communication Level × Agent ToM Capability) to isolate the effects of Mutual Theory of Mind in a shared workspace.
Deploys a real LLM-driven agent (GPT-4o mini) rather than a Wizard-of-Oz setup, enabling actual autonomous reasoning and communication.
Integrates communication directly into the agent's ToM control framework, where the agent considers both human actions and language to shape its decisions.
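
The crossed design (Communication Level × Agent ToM Capability) can be pictured as a grid of conditions. The sketch below is only illustrative: it assumes two levels per factor, and the level names are placeholders rather than the paper's labels. It also does not encode which factor is between-subjects and which is within-subjects, since this summary does not say.

```python
from itertools import product

# Assumed, illustrative level names (not taken from the paper).
COMMUNICATION_LEVELS = ["no_bidirectional", "bidirectional"]
AGENT_TOM_LEVELS = ["without_tom", "with_tom"]


def enumerate_conditions():
    """Return every (communication, agent_tom) cell of the crossed design."""
    return list(product(COMMUNICATION_LEVELS, AGENT_TOM_LEVELS))


conditions = enumerate_conditions()
for communication, agent_tom in conditions:
    print(f"communication={communication}, agent_tom={agent_tom}")
```

Crossing the factors this way is what lets the effect of communication be separated from the effect of agent ToM, instead of studying each in isolation.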

Architecture

Conceptual diagram of the Mutual Theory of Mind (MToM) process.

Breakthrough Assessment

7/10

Provides counter-intuitive evidence that more communication isn't always better in HATs and validates LLM-driven ToM agents in real-time settings. The focus on 'Mutual' ToM is a valuable conceptual shift.

⚙️ Technical Details

Problem Definition

Setting: Real-time shared workspace collaboration (Overcooked game) where a human and AI agent must coordinate actions to prepare orders.

Inputs: Observed environment state, partner's actions, and verbal messages (if enabled).

Outputs: Agent actions (movement/interaction) and verbal communication content.
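
As a rough sketch, the inputs and outputs above could be represented with two small record types. All class and field names here are hypothetical, chosen for illustration; the paper's actual data structures are not described in this summary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """What the agent receives at each decision point (hypothetical fields)."""
    environment_state: dict          # e.g. positions, pot contents, pending orders
    partner_actions: list            # recent actions observed from the human
    partner_message: Optional[str] = None  # None when verbal messages are disabled


@dataclass
class AgentOutput:
    """What the agent produces (hypothetical fields)."""
    action: str                      # a movement or interaction command
    message: Optional[str] = None    # optional verbal communication content


obs = Observation(environment_state={"pot": "empty"}, partner_actions=["chop_onion"])
out = AgentOutput(action="fetch_plate")
```

Making the message fields optional mirrors the "if enabled" qualifier on verbal messages: the same types cover conditions with and without communication.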

Pipeline Flow

Perception (Environment & Human Actions)
ToM Reasoning (LLM-based Belief Construction)
Decision Making (Action & Communication Generation)
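
The three stages above can be sketched as a single agent step. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the prompt wording, and the keyword-based decision rule are all invented here, and the language model is replaced by a stub so the example runs offline.

```python
def perceive(state, partner_actions, partner_message=None):
    """Stage 1: bundle the environment state and the human's recent behavior."""
    return {
        "state": state,
        "partner_actions": list(partner_actions),
        "partner_message": partner_message,
    }


def reason_about_partner(percept, llm):
    """Stage 2: ask a language model for a belief about the partner's intent."""
    prompt = (
        f"Kitchen state: {percept['state']}\n"
        f"Partner actions: {percept['partner_actions']}\n"
        f"Partner message: {percept['partner_message']}\n"
        "What is the partner trying to do next?"
    )
    return llm(prompt)


def decide(belief):
    """Stage 3: map the belief to an action and an optional message (toy rule)."""
    if "plate" in belief.lower():
        return "fetch_plate", "I'll bring a plate."
    return "chop_onion", None


def step(state, partner_actions, partner_message, llm):
    percept = perceive(state, partner_actions, partner_message)
    belief = reason_about_partner(percept, llm)
    return decide(belief)


def fake_llm(prompt):
    """Stand-in for a real model call; always returns the same belief."""
    return "The partner probably needs a plate next."


action, message = step({"pot": "cooking"}, ["put_onion_in_pot"], None, fake_llm)
```

Passing the model in as a callable keeps the pipeline testable without network access; a real deployment would substitute an actual model client for `fake_llm`.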

System Modules

ToM Agent Core

Infers human intent and decides actions based on history

Model or implementation: GPT-4o mini

Novel Architectural Elements

Integration of communication generation directly within the ToM belief update loop, rather than as a separate post-hoc module.
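
One way to picture this integration is a single model reply that carries the belief, the action, and the message together, so the message is produced in the same reasoning pass as the belief update. The JSON schema below is a hypothetical format invented for illustration; the paper's actual prompt and output format are not given in this summary.

```python
import json


def parse_joint_response(raw):
    """Parse one reply holding belief, action, and message (assumed schema)."""
    data = json.loads(raw)
    # Treat a missing or empty message as "say nothing this step".
    message = data.get("message") or None
    return data["belief"], data["action"], message


# Example reply in the assumed schema, as a model might return it.
raw_reply = json.dumps({
    "belief": "Partner is waiting for a plate.",
    "action": "fetch_plate",
    "message": "Plate coming up.",
})

belief, action, message = parse_joint_response(raw_reply)
```

A post-hoc design would instead choose the action first and call a separate module to phrase a message afterwards, which risks the message drifting from the belief that drove the action.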

Modeling

Base Model: GPT-4o mini

Comparison to Prior Work

vs. Carroll et al.: Uses an LLM for real-time dynamic inference rather than static human modeling.
vs. MindAgent: Focuses specifically on the *Mutual* ToM aspect and the impact of communication interactivity on the human's mental model.
vs. Standard RL Agents: Enables explicit verbal communication grounded in the task via an LLM, addressing the 'grounding problem' faced by RL agents.

Limitations

The experiment was conducted online (n=68), so results may differ from in-person, high-stakes collaboration.
The specific task (Overcooked) emphasizes real-time pressure, which may bias results against bidirectional communication compared to slower tasks.
The agent's ToM is simulated via LLM prompting, which may not perfectly mirror human cognitive processes.

📊 Experiments & Results

Evaluation Setup

Online study with n=68 participants playing Overcooked with an AI agent.

Benchmarks:

Overcooked (Collaborative cooking game)

Metrics:

Team Performance (score/burgers cooked)
Perceived Intelligence
Feeling of being understood (Subjective)
Task Load (NASA-TLX)
Statistical methodology: Not explicitly reported in the paper
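
For the Task Load metric listed above, NASA-TLX collects ratings on six subscales, and a common summary is the unweighted mean of those ratings (often called Raw TLX). The sketch below assumes that scoring variant and a 0-100 rating scale; this summary does not state whether the study used raw or weighted scoring, and the example ratings are made up.

```python
TLX_SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted mean of the six NASA-TLX subscale ratings."""
    missing = [name for name in TLX_SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    return sum(ratings[name] for name in TLX_SUBSCALES) / len(TLX_SUBSCALES)


# Made-up ratings for one participant, on an assumed 0-100 scale.
example_ratings = {
    "mental_demand": 70,
    "physical_demand": 20,
    "temporal_demand": 80,
    "performance": 40,
    "effort": 60,
    "frustration": 30,
}

score = raw_tlx(example_ratings)  # (70 + 20 + 80 + 40 + 60 + 30) / 6 = 50.0
```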

Main Takeaways

Bidirectional communication lowers team performance in real-time tasks, likely due to the high operational burden (cognitive load and time cost) of messaging while acting.
An agent's ToM capability does not significantly improve objective team score but significantly enhances the human's subjective feeling of being understood.
Humans rely more on the agent's behaviors (non-verbal cues) than verbal messages to infer whether the agent possesses Theory of Mind.
Participants tend to abandon verbal communication when the task requires high-frequency real-time actions, viewing it as a burden.

📚 Prerequisite Knowledge

Prerequisites

Theory of Mind (ToM)
Human-AI Teaming (HAT) concepts
Basics of LLM-based agents

Key Terms

ToM: Theory of Mind—the ability to infer mental states, intentions, emotions, and beliefs of others to predict and adjust behavior.

MToM: Mutual Theory of Mind—a framework describing the constant, bidirectional process of reasoning and attributing mental states between a human and an AI agent during collaboration.

HAT: Human-AI Team—a collaboration where humans and AI are recognized as unique contributors working toward a common goal.

Shared Workspace: A physical or virtual setting where collaborators work in the same space and can observe each other's actions (e.g., the kitchen in Overcooked).

LLM: Large Language Model—AI models like GPT-4o mini used here to power the agent's reasoning and communication.

Overcooked: A cooperative cooking simulation game widely used as a benchmark for AI coordination and human-AI collaboration.

Bidirectional Communication: A communication setting in which both the human and the AI agent can send and receive messages.