
LLM-Augmented Digital Twin for Policy Evaluation in Short-Video Platforms

Haoting Zhang, Yunduan Lin, Jinghai He, Denglin Jiang, Zuo-Jun Shen, Zeyu Zheng
arXiv (2026)
Agent Recommendation Memory P13N Benchmark

📝 Paper Summary

Multi-agent simulation, Social platform simulation
A modular digital twin architecture for short-video platforms that integrates selective LLM usage with event-driven simulation to enable safe, counterfactual policy evaluation under realistic closed-loop feedback.
Core Problem
Evaluating policies on short-video platforms is difficult because production A/B tests are risky and noisy, while existing simulators lack the semantic realism and strategic user adaptation needed for accurate counterfactuals.
Why it matters:
  • Platform interventions (ranking, moderation) are ethically sensitive and can introduce unfairness or social harm if deployed without rigorous testing
  • Closed-loop feedback (exposure shapes behavior which shapes future exposure) makes causal attribution difficult in live environments
  • Existing agent-based simulators rely on simplified rules that miss semantic nuances, while pure LLM agents are too slow and expensive for platform-scale simulation
Concrete Example: A change in recommendation logic might initially boost engagement, but if creators strategically adapt by producing lower-quality clickbait, long-term retention drops. Standard simulators with static agent rules fail to predict this co-evolution.
Key Novelty
Four-Twin Architecture with Tiered LLM Execution
  • Decomposes the ecosystem into four distinct 'twins' (User, Content, Interaction, Platform) interacting solely through a typed event bus, allowing isolated replacement of policy components
  • Implements a 'Live/Cached/Surrogate' execution tier that selectively uses LLMs for high-value tasks (personas, captions) and falls back to cheaper heuristics to balance fidelity with scale
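The two ideas above can be illustrated with a minimal sketch. It is not the authors' implementation: the event classes (`VideoPosted`, `CaptionReady`), the `EventBus` and `TieredCaptioner` classes, the `live_budget` parameter, and the cache-first fallback order are all assumptions made for illustration, and `fake_llm` is a stand-in for a real model call. Only one of the four twins (Content) is shown.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical event types; the paper's actual event schema is not given here.
@dataclass(frozen=True)
class VideoPosted:
    video_id: str
    topic: str

@dataclass(frozen=True)
class CaptionReady:
    video_id: str
    caption: str
    tier: str  # "live", "cached", or "surrogate"

class EventBus:
    """Routes each event to the handlers subscribed to its exact type."""
    def __init__(self):
        self._handlers = defaultdict(list)
        self.log = []  # every published event, in publication order

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        self.log.append(event)
        for handler in list(self._handlers[type(event)]):
            handler(event)

class TieredCaptioner:
    """Assumed fallback order: reuse a cached caption, else call the LLM
    while a call budget remains, else use a cheap template surrogate."""
    def __init__(self, llm=None, live_budget=0):
        self.llm = llm                  # any callable str -> str, or None
        self.live_budget = live_budget  # remaining live LLM calls
        self.cache = {}

    def caption(self, topic):
        if topic in self.cache:
            return self.cache[topic], "cached"
        if self.llm is not None and self.live_budget > 0:
            self.live_budget -= 1
            text = self.llm(topic)
            self.cache[topic] = text
            return text, "live"
        return f"A short video about {topic}", "surrogate"

class ContentTwin:
    """Talks to the rest of the system only through the bus."""
    def __init__(self, bus, captioner):
        self.bus = bus
        self.captioner = captioner
        bus.subscribe(VideoPosted, self.on_video_posted)

    def on_video_posted(self, event):
        text, tier = self.captioner.caption(event.topic)
        self.bus.publish(CaptionReady(event.video_id, text, tier))

def fake_llm(topic):
    # Stand-in for a real model call, so the sketch runs offline.
    return f"[LLM caption] {topic}"

bus = EventBus()
ContentTwin(bus, TieredCaptioner(llm=fake_llm, live_budget=1))
for vid, topic in [("v1", "cooking"), ("v2", "cooking"), ("v3", "travel")]:
    bus.publish(VideoPosted(vid, topic))

tiers = [e.tier for e in bus.log if isinstance(e, CaptionReady)]
print(tiers)  # ['live', 'cached', 'surrogate']
```

Because the Content twin only subscribes to and publishes events, a different captioning policy could be swapped in by passing another object with a `caption` method, without touching any other component. That is the kind of isolated replacement the event-bus design is meant to allow.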
Breakthrough Assessment
7/10
Proposes a robust architecture for a critical industrial problem (platform policy evaluation). The hybrid execution model addresses the key cost/realism bottleneck in LLM simulations, though experimental validation is missing from the provided text.