Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

📝 Paper Summary

Social simulation with LLM Agents
Cognitive bias in AI
Hallucination as a feature

CogMir utilizes the systematic hallucination properties of LLM Agents to simulate human-like irrational cognitive biases, arguing that these biases are fundamental to social intelligence.

Core Problem

Existing multi-agent studies treat agents as black boxes, focusing on outputs while neglecting the internal cognitive processes and the potential social utility of hallucinations.

Why it matters:

LLMs face hallucination issues typically seen as defects, but these may parallel human irrationality which is adaptive for social environments
Current evaluations focus on task-solving or objective factuality, missing the 'irrational' social intelligence central to human interaction
There is no standard framework for mapping social science experiments on cognitive bias to Multi-LLM Agent environments

Concrete Example: In a 'Herd Effect' experiment, a human might ignore their own correct belief to follow a group's wrong answer. A standard factual LLM benchmark would penalize this as a 'hallucination' or error, whereas CogMir evaluates it as a socially intelligent, human-like adaptation to group pressure.
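
A Herd Effect trial of this kind can be pictured as a prompt in which several simulated participants give the same wrong answer before the agent is asked. The following is a minimal sketch under stated assumptions, not CogMir's actual prompt format: the function names `build_herd_prompt` and `follows_herd`, the wording of the prompt, and the toy question are all hypothetical.

```python
from typing import List


def build_herd_prompt(question: str, options: List[str],
                      majority_answer: str, n_peers: int) -> str:
    """Compose a prompt where n_peers simulated participants answer first."""
    if majority_answer not in options:
        raise ValueError("majority_answer must be one of the options")
    lines = [f"Question: {question}", "Options: " + ", ".join(options)]
    # Every simulated peer gives the same answer to create group pressure.
    for i in range(1, n_peers + 1):
        lines.append(f"Participant {i}: I choose {majority_answer}.")
    lines.append("You are the last participant. Which option do you choose?")
    return "\n".join(lines)


def follows_herd(agent_answer: str, majority_answer: str,
                 correct_answer: str) -> bool:
    """True when the agent sides with a wrong majority over the correct answer."""
    return majority_answer != correct_answer and agent_answer == majority_answer


# Hypothetical toy trial: three peers all pick the wrong option.
prompt = build_herd_prompt("What is 2 + 2?", ["A) 4", "B) 5"], "B) 5", n_peers=3)
```

A factual benchmark would score the answer "B) 5" as an error, while `follows_herd` records the same answer as an instance of conformity.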

Key Novelty

Hallucination as Social Intelligence (CogMir Framework)

Reinterprets LLM 'hallucinations' as analogous to human cognitive biases (e.g., imagination, irrationality) necessary for social adaptation
Provides a modular framework to 'mirror' classic social science experiments (like Asch's conformity tests) into Multi-LLM Agent environments
Uses 'System Objects', 'Interaction Combinations', and 'Communication Modes' to structurally replicate human social settings for agents

Architecture

The CogMir framework structure showing the workflow from environment setting to evaluation

Evaluation Highlights

LLM Agents and humans exhibit high consistency in irrational and prosocial decision-making under uncertain conditions
LLM Agents demonstrate higher sensitivity to factors like certainty and social status than humans, showing more variability in bias
Assessments confirm that LLM Agents replicate counter-intuitive phenomena such as the Herd Effect and the Authority Effect

Breakthrough Assessment

7/10

Novel theoretical framing of hallucination as a positive social feature. Good interdisciplinary grounding. However, primarily an evaluation/simulation framework rather than a new architectural advance.

⚙️ Technical Details

Problem Definition

Setting: Multi-agent social simulation mirroring human psychology experiments

Inputs: Social scenarios (e.g., surveys, group interactions) adapted from social science literature

Outputs: Agent behaviors, decisions, and dialogue analyzed for cognitive biases

Pipeline Flow

Mirror Environmental Settings (Literature Search -> Manual Selection -> LLM Summarization)
Framework Configuration (Select Objects, Communication Modes, Interaction Combinations)
Experiment Execution (Human-Agent Q&A or Multi-Agent Interaction)
Evaluation (Bias Rate Calculation via Discriminators)
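
The four stages above can be pictured as a configuration object feeding an experiment loop. This is a minimal sketch, not the framework's real interface: `ExperimentConfig`, `run_experiment`, and the stub agent and discriminator are hypothetical names, and the stubs stand in for an LLM call and a real judge.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExperimentConfig:
    """Hypothetical container for one mirrored experiment."""
    scenario: str            # prompt text adapted from a social science study
    communication_mode: str  # "broadcast" or "point-to-point"
    interaction: str         # e.g. "human-agent-qa" or "multi-human-agent"
    n_trials: int = 10


def run_experiment(config: ExperimentConfig,
                   agent: Callable[[str], str],
                   discriminator: Callable[[str], bool]) -> float:
    """Run n_trials, flag biased responses, and return the fraction flagged."""
    flags: List[bool] = []
    for _ in range(config.n_trials):
        response = agent(config.scenario)
        flags.append(discriminator(response))
    return sum(flags) / len(flags) if flags else 0.0


def stub_agent(prompt: str) -> str:
    """Placeholder for an LLM call; always returns the majority answer."""
    return "B"


def stub_discriminator(response: str) -> bool:
    """Placeholder judge; treats 'B' as the biased answer in this toy setup."""
    return response.strip() == "B"


config = ExperimentConfig(
    scenario="Everyone else answered B. Which option is correct, A or B?",
    communication_mode="broadcast",
    interaction="multi-human-agent",
    n_trials=5,
)
rate = run_experiment(config, stub_agent, stub_discriminator)
```

Passing the agent and the discriminator as plain callables keeps the loop independent of any particular model or judging method.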

System Modules

Environment Mirroring

Adapt human social science experiments into LLM-compatible prompts and scenarios

Model or implementation: LLM (for summarization/adaptation)

Communication Mode

Define information flow between agents

Model or implementation: Rules-based (Broadcast or Point-to-point)
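
The two rules can be illustrated with plain message routing. This is a sketch under assumptions: the function names and the inbox representation are hypothetical, and point-to-point is modelled here as a chain of pairwise hand-offs, which is one reading of the "series" arrangement.

```python
from typing import Dict, List


def broadcast(sender: str, message: str,
              receivers: List[str]) -> Dict[str, List[str]]:
    """One sender delivers the same message to every receiver at once."""
    return {r: [f"{sender}: {message}"] for r in receivers}


def point_to_point(chain: List[str], message: str) -> Dict[str, List[str]]:
    """Pass a message along a chain; each participant hears only the previous one."""
    inboxes: Dict[str, List[str]] = {name: [] for name in chain[1:]}
    for sender, receiver in zip(chain, chain[1:]):
        inboxes[receiver].append(f"{sender}: {message}")
    return inboxes


# Hypothetical participants: one simulated human (H1) and two agents (A1, A2).
parallel = broadcast("H1", "The answer is B", ["A1", "A2"])
series = point_to_point(["H1", "A1", "A2"], "The answer is B")
```

In the broadcast case both agents hear H1 directly; in the chained case A2 only hears the message as relayed by A1.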

Discriminator

Assess agent responses for specific biases

Model or implementation: Hybrid (SimCSE, FactScore, Human, or LLM-as-Judge)
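
A similarity-based discriminator can be sketched as a threshold on the cosine similarity between the agent's response and a reference biased answer. The code below is illustrative only: `toy_embed` is a bag-of-words stand-in for a sentence encoder such as SimCSE, and the function names and the 0.8 threshold are assumptions rather than values from the paper.

```python
import math
from collections import Counter
from typing import Callable, Dict

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Bag-of-words stand-in for a real sentence encoder (e.g. SimCSE)."""
    return dict(Counter(text.lower().split()))


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_discriminator(response: str, biased_reference: str,
                             embed: Callable[[str], Vector] = toy_embed,
                             threshold: float = 0.8) -> bool:
    """Flag a response as biased when it is close to the biased reference."""
    return cosine(embed(response), embed(biased_reference)) >= threshold
```

Swapping `toy_embed` for a real encoder would leave the thresholding logic unchanged, and a human or LLM judge could fill the same role with a function of the same signature.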

Novel Architectural Elements

Modular 'Mirroring' pipeline that formally maps social science experimental constraints (objects, comms, roles) to multi-agent system parameters
Integration of standard psychological bias definitions (Herd, Authority, etc.) as distinct evaluation modules

Comparison to Prior Work

vs. SocialIQA/ToMi: CogMir focuses on 'irrational' biases and hallucinations as features, not just reasoning accuracy
vs. Generative Agents: Incorporates formal psychological experimental designs (Asch, Milgram) rather than open-ended village simulation
vs. Standard Hallucination Benchmarks: Frames hallucination as a social capability (adaptation) rather than a failure mode

Limitations

Relies on 'simulated' humans for Multi-H-A interactions rather than real human subjects
Explains social intelligence via analogy to evolutionary psychology rather than causal proof
Sensitivity of LLMs to prompt phrasing may conflate bias with instruction following
This summary does not detail quantitative results for each bias subset; findings are reported only in general terms

Reproducibility

Code: https://github.com/XuanLiu-Leo/CogMir

The paper describes the framework and its subsets and characterizes CogMir as an 'open-ended framework'; the code repository is linked above. Evaluation depends on standard discriminators (SimCSE, FactScore) and on the specific social science scenarios defined in the paper.

📊 Experiments & Results

Evaluation Setup

Simulation of 7 classic cognitive bias experiments (e.g., Asch conformity, Milgram authority)

Benchmarks:

CogMir Subsets (Social simulation / Bias detection) [New]

Metrics:

Q&A Bias Rate (Rate_Bqa)
Multi-H-A Bias Rate (Rate_Bmha)
Statistical methodology: Not explicitly reported in the paper
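
Neither rate is defined in this summary. A minimal sketch, assuming each is simply the share of trials that the discriminator flags as biased (the symbols N, r_i, and D are introduced here for illustration and are not taken from the paper):

```latex
\[
\mathrm{Rate}_{B} \;=\; \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left[ D(r_i) = \text{biased} \right]
\]
```

Here N is the number of trials, r_i is the agent's response in trial i, and D is the discriminator. Under this reading, Rate_Bqa would be computed over Human-Agent Q&A trials and Rate_Bmha over Multi-H-A trials. As a hypothetical illustration, 7 flagged responses out of 10 trials would give a rate of 0.7.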

Main Takeaways

LLM Agents exhibit high consistency with humans in prosocial cognitive biases (e.g., Ben Franklin effect)
Agents are hypersensitive to certainty and social status compared to humans, leading to higher variability in decision-making
Hallucination (imagination) appears to be a condition for these social behaviors, mirroring human irrationality
The framework successfully replicates classic effects like Herd Effect and Authority Effect in agent populations

📚 Prerequisite Knowledge

Prerequisites

Evolutionary psychology (cognitive biases)
Large Language Models (hallucination phenomena)
Information theory (channel capacity)

Key Terms

Cognitive Bias: Systematic patterns of deviation from norm or rationality in judgment (e.g., following the crowd)

Systematic Hallucination: Structured deviations from factual accuracy in LLMs, treated here as a feature mirroring human imagination/bias

Multi-H-A: Multi-Human-Agent interaction, i.e., scenarios involving both simulated human participants and LLM agent participants

Herd Effect: The tendency of individuals to follow the actions of a larger group, disregarding their own beliefs

Authority Effect: The tendency to comply with instructions from a perceived authority figure

Ben Franklin Effect: A psychological phenomenon where doing a favor for someone makes you more likely to like them

Broadcast Communication: A mode where one sender transmits to multiple receivers simultaneously (Parallel)

Point-to-point Communication: A mode establishing communication between two specific entities (Series)

SimCSE: A sentence embedding technique used as a discriminator to measure semantic similarity

FactScore: A metric for evaluating the factual accuracy of generated text

SelfCheck: A method for checking hallucination by sampling multiple responses