
Augmenting Dialog with Think-Aloud Utterances for Modeling Individual Personality Traits by LLM

Seiya Ishikura, Hiroaki Yamada, Tatsuya Hiraoka, Takenobu Tokunaga
Institute of Science Tokyo, Mohamed bin Zayed University of Artificial Intelligence, Nara Institute of Science and Technology, Fujitsu Limited
arXiv (2025)
P13N Benchmark

📝 Paper Summary

Conversational personalization, User modeling
Augmenting dialog training data with LLM-generated internal thoughts (Think-Aloud Utterances) before each response helps models better mimic specific human personality traits like Agreeableness and Neuroticism.
Core Problem
Modeling non-celebrity personalities is difficult because detailed profiles are scarce, and training on raw dialog alone often fails to capture internal psychological states driving behavior.
Why it matters:
  • Non-celebrity personas are hard to replicate because they lack the public biographies and detailed profiles that exist for famous figures.
  • Surface-level utterances in chat logs often lack explicit information about the speaker's internal personality traits and emotional state.
  • Accurate personality modeling is crucial for consistent user interactions in entertainment and personalized agents.
Concrete Example: A speaker with high Neuroticism might say 'I'm in college now,' which seems neutral. However, their internal thought might be 'I feel anxious about the future.' Without the internal thought (TAU), the model misses the anxiety trait underlying the neutral statement.
Key Novelty
Think-Aloud Utterance (TAU) Augmentation
  • Use a powerful LLM to synthesize a 'Think-Aloud Utterance' (TAU) that verbalizes the target speaker's hidden thoughts and feelings, and insert it before each of that speaker's turns in the training dialog.
  • Fine-tune the persona model to generate both the internal thought and the final utterance, allowing it to learn the psychological process behind the speech.
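The two steps above can be sketched as a small data transformation. This is a minimal illustration, not the authors' pipeline: the `Turn` record, `augment_with_tau`, and the `generate_tau` callback are hypothetical names, and the stub stands in for the LLM call. Whether the generator is shown the upcoming utterance is an assumption made here, not something the summary states.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    speaker: str
    text: str
    is_tau: bool = False  # True for synthetic think-aloud turns


def augment_with_tau(
    dialog: List[Turn],
    target: str,
    generate_tau: Callable[[List[Turn], Turn], str],
) -> List[Turn]:
    """Insert one TAU before every turn spoken by `target`."""
    augmented: List[Turn] = []
    for turn in dialog:
        if turn.speaker == target:
            # Assumption: the generator sees the augmented history so far
            # plus the utterance it is about to precede.
            thought = generate_tau(augmented, turn)
            augmented.append(Turn(target, thought, is_tau=True))
        augmented.append(turn)
    return augmented


def stub_generate_tau(history: List[Turn], turn: Turn) -> str:
    # Placeholder only; a real pipeline would prompt an LLM here.
    return f"(thinking before saying: {turn.text!r})"


dialog = [
    Turn("A", "What do you do these days?"),
    Turn("B", "I'm in college now."),
]
augmented = augment_with_tau(dialog, target="B", generate_tau=stub_generate_tau)
for t in augmented:
    label = f"{t.speaker} (TAU)" if t.is_tau else t.speaker
    print(f"{label}: {t.text}")
```

Each fine-tuning example would then pair the dialog history with a target made of the TAU followed by the actual utterance, so the model learns to produce both.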
Architecture
Figure 1: Conceptual diagram of the TAU augmentation process
Evaluation Highlights
  • Compared with standard dialog training, TAU augmentation reduces MSE for Agreeableness and Neuroticism consistently across all tested base models (e.g., with gpt-4o-mini, Neuroticism MSE drops from 1.662 to 1.571).
  • Higher-quality TAUs (generated by GPT-4o rather than Qwen) lead to better personality alignment, further reducing MSE by ~0.09 for Agreeableness and ~0.24 for Neuroticism in the gpt-4o-mini experiments.
  • Including explicit Big Five scores in the augmentation prompt further improves alignment for Extraversion, Agreeableness, and Neuroticism.
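For readers unfamiliar with the metric, the sketch below shows one plausible way a per-trait MSE could be computed. It assumes the error is measured between Big Five scores elicited from the persona model and the target speakers' own scores; the summary does not spell out the exact protocol. The function name `per_trait_mse` and the numbers in the example are illustrative only and are not results from the paper.

```python
from typing import Dict, List

TRAITS = [
    "Openness",
    "Conscientiousness",
    "Extraversion",
    "Agreeableness",
    "Neuroticism",
]


def per_trait_mse(
    predicted: List[Dict[str, float]],
    reference: List[Dict[str, float]],
) -> Dict[str, float]:
    """Mean squared error per trait, averaged over speakers."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("need two non-empty score lists of equal length")
    n = len(predicted)
    return {
        trait: sum((p[trait] - r[trait]) ** 2 for p, r in zip(predicted, reference)) / n
        for trait in TRAITS
    }


# Made-up scores for two speakers (illustration only).
model_scores = [{t: 3.0 for t in TRAITS}, {t: 4.0 for t in TRAITS}]
human_scores = [{t: 2.0 for t in TRAITS}, {t: 4.0 for t in TRAITS}]
mse = per_trait_mse(model_scores, human_scores)
for trait in TRAITS:
    print(f"{trait}: {mse[trait]:.3f}")
```

Lower values mean the persona model's elicited trait scores sit closer to those of the person it imitates.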
Breakthrough Assessment
6/10
A simple but effective data augmentation technique for personality modeling. It shows promise for modeling internal states, though the gains are not consistent across the Big Five traits or across base models.