
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Y. Bengio, Michael K. Cohen, Damiano Fornasiere, J. Ghosn, Pietro Greiner, Matt MacDermott, S. Mindermann, Adam Oberman, Jesse Richardson, Oliver E. Richardson, Marc-Antoine Rondeau, P. St-Charles, David Williams-King
Stanford University, Google DeepMind
arXiv.org (2025)
Memory Agent Reasoning Benchmark

📝 Paper Summary

Agentic AI Social Simulation
Agents constructed from large language models can autonomously simulate the behaviors, attitudes, and social dynamics of 1,000 distinct individuals with high fidelity to real human data.
Core Problem
Social science research is slow, expensive, and difficult to reproduce because it relies on recruiting human participants, while existing AI simulations lack the fidelity and scale to serve as valid proxies.
Why it matters:
  • Traditional social science experiments suffer from the 'replication crisis' and high logistical costs of coordinating human subjects
  • Policymakers and researchers lack tools to 'sandbox' social interventions (e.g., public health messaging) before deploying them in the real world
  • Prior agent simulations were too simplistic or small-scale (e.g., toy environments) to capture the complexity of broad societal dynamics
Concrete Example: In the 'Grammars of Action' replication, human participants played a dictator game where they decided how much money to share. Current approaches using generic LLM personas fail to capture the nuances of how fairness norms shift based on social context, whereas the proposed agents accurately replicated the human distribution of selfish vs. altruistic offers.
Key Novelty
Generative Agent Architecture applied at Mass Scale (1,000 Agents)
  • Instantiates 1,000 distinct agents with unique memories, occupations, and social networks based on real demographic data (like the US Census)
  • Equips agents with a memory stream and reflection mechanism that allows them to retrieve past experiences and synthesize high-level inferences, enabling consistent long-term behavior
  • Demonstrates 'agent validation' by replicating classic social science studies (e.g., the General Social Survey and economic games such as the dictator game) and comparing agent behavior directly to human data
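The memory stream and reflection mechanism described in the bullets above can be illustrated with a minimal sketch. The summary does not give the actual retrieval rule, so everything here is an assumption made for illustration: the `MemoryStream` class, the exponential recency decay, the keyword-overlap relevance score, and the `reflect` placeholder (a real system would presumably ask a language model to synthesize the inference).

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    timestamp: int  # simulation step at which the memory was recorded


@dataclass
class MemoryStream:
    """Hypothetical memory stream; the scoring rule is an assumption."""
    memories: list = field(default_factory=list)
    decay: float = 0.9  # assumed per-step recency decay

    def add(self, text: str, timestamp: int) -> None:
        self.memories.append(Memory(text, timestamp))

    def score(self, memory: Memory, query: str, now: int) -> float:
        # Recency: newer memories score higher (exponential decay).
        recency = self.decay ** (now - memory.timestamp)
        # Relevance: fraction of query words that also appear in the memory.
        query_words = set(query.lower().split())
        memory_words = set(memory.text.lower().split())
        relevance = len(query_words & memory_words) / max(len(query_words), 1)
        return recency + relevance

    def retrieve(self, query: str, now: int, k: int = 2) -> list:
        # Return the texts of the k highest-scoring memories.
        ranked = sorted(
            self.memories,
            key=lambda m: self.score(m, query, now),
            reverse=True,
        )
        return [m.text for m in ranked[:k]]

    def reflect(self, query: str, now: int, k: int = 2) -> str:
        # Placeholder for reflection: simply joins the retrieved memories.
        # A real agent would pass them to a language model instead.
        return "Reflection on '" + query + "': " + "; ".join(
            self.retrieve(query, now, k)
        )


stream = MemoryStream()
stream.add("talked with a neighbor about the party", 1)
stream.add("worked a shift at the cafe", 2)
stream.add("bought decorations for the party", 3)
top = stream.retrieve("party plans", now=4, k=2)
```

In this toy run, the two party-related memories outrank the unrelated cafe shift, and the more recent of the two comes first. Combining recency and relevance in one score is one simple way to keep an agent's behavior consistent over time without replaying its entire history.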
Evaluation Highlights
  • 0.85 correlation between agent and human responses on the General Social Survey (GSS), matching human test-retest reliability
  • Replicated the 'Grammars of Action' experiment with high fidelity, capturing five distinct social norm patterns (e.g., fairness, selfishness) that were indistinguishable from human results
  • Agents spontaneously organized a 'Valentine's Day party' in a sandbox simulation, demonstrating emergent social coordination without explicit scripting
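The GSS comparison above relies on two correlations: agent responses against human responses, and human responses against the same humans' later retest. A minimal sketch of that comparison follows. The function names are hypothetical and the response vectors are made-up toy data, not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def compare_to_retest(agent, human_first, human_second):
    """Return (agent-vs-human r, human test-retest r) for side-by-side reading."""
    return pearson(agent, human_first), pearson(human_first, human_second)


# Toy survey answers on a 1-5 scale (illustrative values only).
human_first = [1, 2, 3, 4, 5, 3]
human_second = [1, 3, 3, 4, 5, 2]
agent = [2, 2, 3, 4, 4, 3]
agent_r, retest_r = compare_to_retest(agent, human_first, human_second)
```

Human test-retest reliability serves as a natural ceiling here: people do not answer identically when surveyed twice, so an agent whose correlation with human answers approaches the retest correlation is about as consistent with a person as that person is with themselves.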
Breakthrough Assessment
9/10
A landmark paper that scales generative agents from small toy examples to a population of 1,000 and validates them against rigorous social science benchmarks. It establishes a new paradigm for computational social science.