โ† Back to Paper List

A LLM-based Controllable, Scalable, Human-Involved User Simulator Framework for Conversational Recommender Systems

Lixi Zhu, Xiaowen Huang, Jitao Sang
School of Computer Science and Technology, Beijing Jiaotong University
arXiv (2024)
Recommendation Agent Memory P13N

๐Ÿ“ Paper Summary

User Simulation for Recommender Systems, Conversational Recommender Systems (CRS)
CSHI is a modular, plugin-based user simulator framework that uses LLMs to generate realistic, controllable, and scalable user interactions for evaluating conversational recommender systems while preventing data leakage.
Core Problem
Existing LLM-based user simulators rely on 'single-prompt' templates that are hard to control and often leak ground-truth item names into the simulator's input, making evaluations unrealistic.
Why it matters:
  • Evaluating Conversational Recommender Systems (CRS) with real humans is prohibitively expensive and time-consuming.
  • Template-based simulators lack conversational flow, while current LLM simulators suffer from 'data leakage' (knowing the target item too early), rendering evaluation metrics unreliable.
  • Researchers need fine-grained control over simulator personalities and preferences to test diverse scenarios, which single-prompt methods cannot easily provide.
Concrete Example: In current simulators, the prompt often includes the target movie name (e.g., 'Target: Matrix') to guide the LLM. The LLM might accidentally mention 'Matrix' or its specific details (runtime 136 mins) before the recommender actually suggests it, creating an unrealistic shortcut.
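The shortcut described in the example can be checked mechanically. The following is a minimal sketch, assuming a dialogue is stored as a list of (speaker, utterance) pairs; the function name and the data layout are hypothetical and are not taken from the paper.

```python
import re


def leaked_before_recommended(dialogue, target_title):
    """Return True if the simulated user names the target before the CRS does.

    dialogue: list of (speaker, utterance) tuples, with speaker "user" or "crs".
    This layout is an assumption made for illustration only.
    """
    pattern = re.compile(r"\b" + re.escape(target_title) + r"\b", re.IGNORECASE)
    for speaker, utterance in dialogue:
        if pattern.search(utterance):
            # Whoever mentions the title first decides the outcome:
            # a first mention on the user side counts as a leak.
            return speaker == "user"
    return False  # the title never appears, so nothing leaked


leaky = [
    ("user", "I want something like The Matrix, about 136 minutes long."),
    ("crs", "How about The Matrix?"),
]
clean = [
    ("user", "I enjoy sci-fi action films."),
    ("crs", "How about The Matrix?"),
    ("user", "The Matrix sounds great."),
]
```

A title match is only the simplest signal. Leaks through unique attributes, such as the exact runtime, would need a separate check on attribute values.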
Key Novelty
Plugin-Managed Phased Simulation Framework (CSHI)
  • Decomposes user simulation into distinct stages (Profile Init, Preference Init, Message Handling), managed by a central plugin manager rather than a single giant prompt.
  • Introduces a 'known vs. unknown' preference split: the simulator knows its general tastes (known) but discovers specific latent preferences (unknown) only when the CRS reveals them, mimicking real discovery.
  • Anonymizes sensitive item attributes (e.g., changing 'released June 1, 2012' to 'the 2010s') to prevent the simulator from leaking unique identifiers during conversation.
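The anonymization and the known/unknown split above can be illustrated with a small sketch. Everything here is an assumption for illustration: the regular expression, the SimulatedUser class, and its field names are hypothetical and do not reproduce CSHI's actual implementation.

```python
import re
from dataclasses import dataclass, field


def anonymize_release_date(text):
    """Replace a date such as 'June 1, 2012' (or a bare year) with its decade.

    Simplification: any standalone four-digit number is treated as a year.
    """
    pattern = re.compile(r"(?:[A-Z][a-z]+ \d{1,2}, )?\b(\d{4})\b")

    def to_decade(match):
        year = int(match.group(1))
        return f"the {year // 10 * 10}s"

    return pattern.sub(to_decade, text)


@dataclass
class SimulatedUser:
    known: dict    # general tastes the simulator may state from the start
    unknown: dict  # preferences hidden until the CRS mentions them
    revealed: dict = field(default_factory=dict)

    def observe(self, crs_utterance):
        """Move an unknown preference to `revealed` once the CRS mentions its value."""
        lowered = crs_utterance.lower()
        # Iterate over a copy because entries are removed during the loop.
        for attr, value in list(self.unknown.items()):
            if value.lower() in lowered:
                self.revealed[attr] = self.unknown.pop(attr)
        return self.revealed


user = SimulatedUser(
    known={"genre": "sci-fi"},
    unknown={"setting": "simulated reality"},
)
```

With this layout, the prompt for the simulator would be built only from `known` and `revealed`, so an attribute in `unknown` cannot surface in the conversation before the recommender raises it.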
Architecture
Figure 1: The overall framework of CSHI, illustrating the interaction between the User Simulator and the CRS.
Evaluation Highlights
  • The CSHI-based simulator produces feedback that closely mirrors real users, supporting reliable assessment of CRS.
  • The framework supports both manual profile editing (Human-Involved) and automated LLM generation, adapting to diverse conversational settings.
  • The framework demonstrates scalability: plugins can be added or removed to meet personalized requirements.
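Adding and removing plugins might look like the following minimal sketch. The PluginManager class, its method names, and the three example plugins are hypothetical; they only illustrate the general idea of a central manager running stages in order and are not CSHI's API.

```python
class PluginManager:
    """Runs registered plugins in registration order over a shared state dict."""

    def __init__(self):
        # name -> callable(state) -> state; dicts preserve insertion order
        self._plugins = {}

    def register(self, name, plugin):
        self._plugins[name] = plugin

    def unregister(self, name):
        # Removing an absent plugin is a no-op.
        self._plugins.pop(name, None)

    def run(self, state):
        for plugin in self._plugins.values():
            state = plugin(state)
        return state


# Illustrative stage plugins; the values they set are placeholders.
def profile_init(state):
    return {**state, "profile": {"age_group": "adult"}}


def preference_init(state):
    return {**state, "known": ["sci-fi"]}


def persona_style(state):
    return {**state, "style": "terse"}


manager = PluginManager()
manager.register("profile_init", profile_init)
manager.register("preference_init", preference_init)
manager.register("persona_style", persona_style)

full_state = manager.run({})

# Shrinking the simulator is a single call; the other stages are untouched.
manager.unregister("persona_style")
reduced_state = manager.run({})
```

Keeping each stage as an independent callable is what lets a profile be edited by hand in one setting and generated by an LLM in another without touching the rest of the pipeline.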
Breakthrough Assessment
7/10
Addresses the critical 'data leakage' flaw in LLM-based user simulation with a sensible architectural change (plugin system). While performance metrics are qualitative/demonstrative, the structural contribution to CRS evaluation is significant.