
Cocoa: Co-Planning and Co-Execution with AI Agents

K. Feng, Kevin Pu, Matt Latzke, Tal August, Pao Siangliulue, Jonathan Bragg, Daniel S. Weld, Amy X. Zhang, Joseph Chee Chang
University of Washington, University of Toronto, Allen Institute for AI, University of Illinois Urbana-Champaign
arXiv.org (2024)
Agent Reasoning

Paper Summary

Multi-turn with user interactions | Multi-task planning
Cocoa is a system that enables users to flexibly delegate tasks and interleave planning and execution with AI agents within a document editor, improving steerability in complex research workflows.
Core Problem
Current AI agent systems either rigidly separate planning from execution or force users into reactive error-correction roles, lacking flexibility for users to proactively guide the agent or adapt plans based on intermediate results.
Why it matters:
  • Rigid separation of planning and execution leads to wasted effort if initial plans are flawed
  • Reactive correction (fixing agent errors after they happen) is cognitively demanding and inefficient
  • Scientific research requires tacit knowledge and iterative refinement that fully autonomous agents currently lack
Concrete Example: A researcher planning a literature review might want to see initial search results for a specific query before deciding whether to broaden the search or dive deeper into a specific sub-topic. In current systems, they must wait for the full execution or manually interrupt the agent, whereas Cocoa allows them to execute the first step, inspect the output, and then modify the subsequent plan steps immediately.
Key Novelty
Interleaved Co-Planning and Co-Execution
  • Introduces a computational notebook-like interface within a text document where agent plans are represented as interactive cells
  • Allows explicit delegation of agency: users can assign specific steps to themselves or the agent
  • Enables fluid transition between planning and execution: users can execute a step, pause to edit future steps based on the output, and resume
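
The interaction pattern described above can be illustrated with a small sketch. This is not Cocoa's implementation; the names `PlanStep`, `Plan`, `run_next`, `edit_step`, and `stub_agent` are hypothetical and chosen only for illustration. It assumes a plan is an ordered list of steps, each assigned to either the agent or the user, and that only steps that have not yet run can be edited.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PlanStep:
    """One cell of the plan (hypothetical structure, not Cocoa's own)."""
    description: str
    assignee: str = "agent"          # "agent" or "user"
    output: Optional[str] = None     # None until the step has been executed

    @property
    def done(self) -> bool:
        return self.output is not None


class Plan:
    def __init__(self, steps: List[PlanStep]):
        self.steps = steps

    def next_pending(self) -> Optional[int]:
        """Index of the first step that has not run yet, or None."""
        for i, step in enumerate(self.steps):
            if not step.done:
                return i
        return None

    def run_next(self, agent: Callable[[str], str],
                 user_input: Optional[str] = None) -> Optional[PlanStep]:
        """Execute exactly one step, then hand control back to the caller."""
        i = self.next_pending()
        if i is None:
            return None
        step = self.steps[i]
        if step.assignee == "agent":
            step.output = agent(step.description)
        else:
            # User-assigned steps are filled in by the user, not the agent.
            if user_input is None:
                raise ValueError("user-assigned step needs user_input")
            step.output = user_input
        return step

    def edit_step(self, index: int, description: str,
                  assignee: Optional[str] = None) -> None:
        """Revise a future step; completed steps are left untouched."""
        step = self.steps[index]
        if step.done:
            raise ValueError("cannot edit a completed step")
        step.description = description
        if assignee is not None:
            step.assignee = assignee


def stub_agent(task: str) -> str:
    """Stand-in for a real agent call (assumption for this sketch)."""
    return f"[agent result for: {task}]"


plan = Plan([
    PlanStep("Search for papers on human-agent planning"),
    PlanStep("Summarize all search results"),
    PlanStep("Draft related-work paragraph"),
])

first = plan.run_next(stub_agent)                       # execute step 1 only
plan.edit_step(1, "Summarize only papers on notebook interfaces")
plan.edit_step(2, "Pick the three most relevant papers", assignee="user")
second = plan.run_next(stub_agent)                      # resume with edited plan
third = plan.run_next(stub_agent, user_input="Papers A, B, C")
```

Running one step at a time and returning control after each is what lets the caller inspect an output and revise later steps before resuming, and reassigning a step to the user models the explicit delegation described in the second bullet.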
Evaluation Highlights
  • Participants in a lab study (n=16) successfully used Cocoa to steer agents in research tasks, balancing control with ease of use compared to chat baselines
  • Field deployment (n=7, 1 week) showed researchers valued explicit delegation, using self-assigned steps to inject expert knowledge into the workflow
  • Qualitative feedback indicated that interleaved planning/execution allowed users to catch agent errors early and refine directions without restarting
Breakthrough Assessment
7/10
Significant contribution to human-agent interaction design by successfully adapting the notebook paradigm to general agentic workflows. While not an algorithmic breakthrough, it offers a strong, validated interaction model for steerability.