
Diagnostic-Guided Dynamic Profile Optimization for LLM-based User Simulators in Sequential Recommendation

Hongyang Liu, Zhu Sun, Tianjun Wei, Yan Wang, Jiajie Zhu, Xinghua Qu
arXiv (2025)
Recommendation Agent Memory P13N

📝 Paper Summary

User Simulation Agentic Recommender Systems
DGDPO refines user simulator profiles by iteratively diagnosing defects using a specialized model and correcting them with a generalized model, enabling realistic multi-round evolution with sequential recommenders.
Core Problem
Existing LLM-based user simulators rely on static, single-step profiles that cannot correct initial inaccuracies or adapt to evolving interests, and fail to simulate realistic multi-round feedback loops.
Why it matters:
  • Static profiles cause simulated behavior to progressively diverge from real user actions as errors persist uncorrected
  • Current simulators mostly use single-round interactions with static recommenders, failing to capture how real users and systems mutually adapt over time
  • General-purpose LLMs hallucinate when asked to self-diagnose profile defects, leading to unreliable updates
Concrete Example: If an initial profile incorrectly states that a user 'dislikes comedy' (an 'Inaccurate' defect), the simulator will consistently reject comedy recommendations. A standard LLM might fail to spot the contradiction between this statement and the interaction history, whereas DGDPO diagnoses the specific defect and updates the profile.
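To make the example concrete, the toy sketch below flags a stated dislike that contradicts observed behavior. The profile layout, the `find_inaccurate_dislikes` helper, and the 0.5 acceptance threshold are illustrative assumptions. In the paper, diagnosis is performed by a trained specialized LLM, not by a rule like this one.

```python
from collections import defaultdict


def find_inaccurate_dislikes(profile, history, min_accept_rate=0.5):
    """Return genres the profile marks as disliked but the user mostly accepts.

    Hypothetical helper for illustration only; not the paper's diagnostic model.
    `history` is a list of (genre, was_accepted) pairs.
    """
    accepted = defaultdict(int)
    seen = defaultdict(int)
    for genre, was_accepted in history:
        seen[genre] += 1
        accepted[genre] += int(was_accepted)

    flagged = []
    for genre in profile.get("dislikes", []):
        # A stated dislike is suspicious when the genre was shown
        # and accepted at least `min_accept_rate` of the time.
        if seen[genre] and accepted[genre] / seen[genre] >= min_accept_rate:
            flagged.append(genre)
    return flagged


# Toy data: the profile claims the user dislikes comedy,
# yet two of three comedy recommendations were accepted.
profile = {"likes": ["thriller"], "dislikes": ["comedy", "horror"]}
history = [
    ("comedy", True),
    ("comedy", True),
    ("comedy", False),
    ("horror", False),
]
flagged = find_inaccurate_dislikes(profile, history)
print(flagged)
```

Here 'comedy' is flagged as a candidate 'Inaccurate' entry, while 'horror' is left alone because the history agrees with the profile.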
Key Novelty
Diagnostic-Guided Dynamic Profile Optimization (DGDPO)
  • Decouples the optimization into a 'Diagnostic' phase (identifying specific defects like inaccuracy or incompleteness) and a 'Treatment' phase (generating fixes)
  • Uses a 'Specialized' small LLM for reliable diagnosis (trained on synthetic defects) and a 'Generalized' large LLM for complex reasoning and profile rewriting
  • Integrates the simulator with Sequential Recommenders (SRs) to enable a bidirectional evolution where both the user profile and the recommender strategy update based on interaction history
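The diagnose-then-treat split described above can be sketched as a small loop. This is a minimal sketch under stated assumptions: `Profile`, `optimize_profile`, `toy_diagnose`, and `toy_treat` are hypothetical names, and the two toy functions are rule-based stand-ins for the specialized small LLM and the generalized large LLM. The sequential recommender's side of the bidirectional update is omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Profile:
    likes: List[str] = field(default_factory=list)
    dislikes: List[str] = field(default_factory=list)


def optimize_profile(profile, history, diagnose, treat, max_rounds=3):
    """Alternate diagnosis and treatment until no defect is reported."""
    for _ in range(max_rounds):
        defects = diagnose(profile, history)  # stand-in for the specialized small LLM
        if not defects:
            break
        profile = treat(profile, defects)  # stand-in for the generalized large LLM
    return profile


def toy_diagnose(profile, history):
    """Rule-based stand-in: report 'inaccurate' and 'incomplete' defects."""
    accepted = {genre for genre, ok in history if ok}
    defects = [("inaccurate", g) for g in profile.dislikes if g in accepted]
    defects += [
        ("incomplete", g)
        for g in sorted(accepted)
        if g not in profile.likes and g not in profile.dislikes
    ]
    return defects


def toy_treat(profile, defects):
    """Rule-based stand-in: rewrite the profile to resolve each defect."""
    likes, dislikes = list(profile.likes), list(profile.dislikes)
    for kind, genre in defects:
        if kind == "inaccurate":
            dislikes.remove(genre)  # drop the contradicted dislike
        if genre not in likes:
            likes.append(genre)  # record the observed preference
    return Profile(likes, dislikes)


start = Profile(likes=["thriller"], dislikes=["comedy"])
history = [("comedy", True), ("sci-fi", True), ("thriller", True)]
final = optimize_profile(start, history, toy_diagnose, toy_treat)
print(final)
```

Passing `diagnose` and `treat` as callables mirrors the decoupling: either stand-in could be swapped for a model call without changing the loop.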
Architecture
Figure 2: The DGDPO framework workflow, comprising the Diagnostic Module, the Treatment Module, and the interaction with Sequential Recommenders.
Evaluation Highlights
  • Specialized diagnostic module achieves 92.20% average accuracy on profile defect identification
  • General-purpose LLMs (without specialized training) achieve only 62.78% accuracy on the same defect identification task
  • Identifies 'Inaccurate', 'Incomplete', and combined profile defects more effectively than baselines
Breakthrough Assessment
7/10
Addresses a critical bottleneck in user simulation (static/hallucinated profiles) with a logical diagnostic-treatment split. The integration with Sequential Recommenders for bidirectional evolution is a significant step toward realistic simulation.