โ† Back to Paper List

COMIC: Agentic Sketch Comedy Generation

Susung Hong, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz
University of Washington, Google
arXiv (2026)
Agent MM RL

๐Ÿ“ Paper Summary

Agentic Video Generation Computational Humor Creative Content Generation
COMIC is a multi-agent framework that generates sketch comedy videos by evolving scripts through competitive tournaments and iteratively refining visual shots using critics aligned with YouTube viewer engagement.
Core Problem
Generating funny, long-form video content is difficult because humor is subjective and context-dependent, while current video models struggle with narrative consistency over long durations.
Why it matters:
  • Standard LLMs often produce 'dad jokes' or cliché puns rather than genuine comedy.
  • Fixed objective functions fail for creative tasks because humor has no single ground truth and evolves with exposure (jokes get stale).
  • Existing video generation pipelines typically produce short, disconnected clips lacking the structural coherence needed for storytelling.
Concrete Example: If you ask a standard model to write a sketch, it might output a generic, unfunny dialogue. COMIC, however, simulates a writers' room where 'island' populations of scripts compete; a losing script about a mundane topic might be rewritten to incorporate a surreal twist from a winning script, eventually evolving into a high-quality sketch.
Key Novelty
Content Optimization via Multi-agent Iterative Competition (COMIC)
  • Replaces fixed reward functions with relative fitness via pairwise tournaments, where losing scripts are updated using feedback from winners (simulating a writers' room).
  • Uses distinct 'islands' of script populations, each governed by different critic personas, to preserve diversity in comedic styles (e.g., slapstick vs. dry wit).
  • Introduces a 'Generate-and-Select' method for critics, creating a pool of diverse evaluator agents and retaining only those that correctly predict real-world YouTube engagement statistics.
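The 'Generate-and-Select' idea in the last bullet can be illustrated with a small sketch. It assumes each candidate critic is a function that predicts which of two items drew more engagement, and that ground-truth pairs are available from engagement statistics. The names `accuracy`, `select_critics`, `committee_vote`, and the `min_accuracy` threshold are hypothetical; the summary does not state the paper's actual selection criterion.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
Critic = Callable[[str, str], int]     # 0 if the first item is predicted more engaging, else 1
LabeledPair = Tuple[str, str, int]     # (item_a, item_b, index of the truly more engaging item)


def accuracy(critic: Critic, pairs: Sequence[LabeledPair]) -> float:
    """Fraction of labeled pairs on which the critic picks the more engaging item."""
    correct = sum(1 for a, b, label in pairs if critic(a, b) == label)
    return correct / len(pairs)


def select_critics(candidates: Sequence[Critic], pairs: Sequence[LabeledPair],
                   min_accuracy: float = 0.5) -> List[Critic]:
    """Keep only critics that beat the (assumed) accuracy threshold."""
    return [c for c in candidates if accuracy(c, pairs) > min_accuracy]


def committee_vote(critics: Sequence[Critic], a: str, b: str) -> int:
    """Majority vote of the retained critics; ties go to the first item."""
    votes_for_b = sum(c(a, b) for c in critics)
    return 1 if 2 * votes_for_b > len(critics) else 0
```

Generating many critic personas and filtering them against held-out engagement data keeps the evaluators diverse while discarding those whose taste does not track real viewers.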
Architecture
Figure 2
The COMIC framework pipeline, detailing the progression from script evolution to video realization.
Evaluation Highlights
  • Outperforms 'Single Best' critic baseline on Studio C, VLDL, and SNL engagement prediction tasks (e.g., +6.5% accuracy on Studio C top-vs-bottom).
  • Achieves state-of-the-art performance in agentic video generation, producing results approaching the quality of professionally produced sketches.
  • Demonstrates effective test-time scaling: increasing the number of rendering iterations directly improves visual quality without retraining.
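The test-time scaling claim in the last bullet can be pictured as a render-critique loop that keeps the best shot seen so far. The sketch below is an assumption about the general shape of such a loop, not the paper's renderer: `refine_shot`, `render`, and `critique` are hypothetical names, and in practice `render` would call a video model and `critique` a critic agent.

```python
from typing import Callable, Optional, Tuple


def refine_shot(render: Callable[[str, Optional[str]], str],
                critique: Callable[[str], Tuple[float, str]],
                prompt: str, iterations: int) -> Tuple[str, float]:
    """Render a shot repeatedly, feeding critic feedback into the next attempt.

    Returns the best-scoring shot and its score. Because only the best attempt
    is kept, the returned score never decreases as `iterations` grows.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    best_shot: Optional[str] = None
    best_score = float("-inf")
    feedback: Optional[str] = None
    for _ in range(iterations):
        shot = render(prompt, feedback)      # first call has no feedback yet
        score, feedback = critique(shot)     # critic returns a score and notes
        if score > best_score:
            best_shot, best_score = shot, score
    assert best_shot is not None
    return best_shot, best_score
```

Nothing in this loop updates model weights; the only knob is `iterations`, which is what makes it a test-time scaling mechanism.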
Breakthrough Assessment
8/10
Significant advance in applying agentic workflows to highly subjective creative tasks. The alignment of critics to real-world YouTube engagement data to serve as a proxy for 'humor' is a clever, impactful methodological contribution.