
PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory

(UPenn) Bowen Jiang, Yuan Yuan, Maohao Shen, Zhuoqun Hao, Zhangchen Xu, Zichen Chen, Ziyi Liu, Anvesh Rao Vijjini, Jiashu He, Hanchao Yu, Radha Poovendran, Gregory Wornell, Lyle Ungar, Dan Roth, Sihao Chen, Camillo Jose Taylor
University of Pennsylvania, Massachusetts Institute of Technology, University of Washington, University of California Santa Barbara, Meta, University of Southern California, University of North Carolina at Chapel Hill, Microsoft Corporation
arXiv, 12/2025
Memory P13N Benchmark RL Reasoning

📝 Paper Summary

Conversational personalization Memory recall Memory organization
PersonaMem-v2 introduces a dataset of realistic interactions where users reveal preferences implicitly, and demonstrates that agentic memory trained via reinforcement learning outperforms long-context models in personalization accuracy and efficiency.
Core Problem
Frontier LLMs struggle to infer user personas and preferences from long, noisy conversation histories where users rarely state preferences explicitly but reveal them through everyday tool-use interactions.
Why it matters:
  • Personalization is critical for aligning AI with diverse user needs in education, healthcare, and emotional support, where there is no single correct answer
  • Current models fail to distinguish between user preferences and noise (e.g., hypothetical questions, third-person messages), leading to poor user understanding
  • Existing evaluations often rely on explicit statements, whereas real-world users treat LLMs as tools, revealing preferences only implicitly over time
Concrete Example: A user might ask a chatbot to polish an email, and the content of that email reveals their dining habits (e.g., preference for vegetarian food). The chatbot must infer this preference from the task context without being explicitly told 'I am a vegetarian', while still performing the polishing task.
Key Novelty
Implicit Personalization via Agentic Memory and RL
  • Curates a dataset where user preferences are revealed implicitly across diverse tasks (e.g., email writing, coding) rather than through direct statements
  • Trains a model to maintain a single, compact memory summary that updates over time, rather than re-reading full conversation history
  • Uses reinforcement learning to optimize the memory creation process, rewarding the model only when the stored memory leads to correct personalized answers later
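The memory loop and outcome-based reward described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function names (`update_memory`, `build_memory`, `outcome_reward`), the word-count budget standing in for the 2k-token limit, and the toy `toy_summarize` / `toy_answer` stand-ins for what would be LLM calls. The actual RL update step is omitted; only the reward signal is shown.

```python
from typing import Callable, List

# Hypothetical budget: words are used as a rough stand-in for the ~2k-token memory.
MEMORY_BUDGET_WORDS = 2000


def update_memory(memory: str, turn: str, summarize: Callable[[str, str], str]) -> str:
    """Fold one conversation turn into the running memory, then enforce the budget."""
    new_memory = summarize(memory, turn)
    words = new_memory.split()
    # Keep only the most recent words if the summary exceeds the budget.
    return " ".join(words[-MEMORY_BUDGET_WORDS:])


def build_memory(turns: List[str], summarize: Callable[[str, str], str]) -> str:
    """Process the history turn by turn; the full history is never re-read."""
    memory = ""
    for turn in turns:
        memory = update_memory(memory, turn, summarize)
    return memory


def outcome_reward(memory: str, question: str, gold: str,
                   answer_fn: Callable[[str, str], str]) -> float:
    """Return 1.0 only if the answer produced from memory alone matches the gold answer."""
    return 1.0 if answer_fn(memory, question) == gold else 0.0


# Toy stand-ins (assumptions): a real system would call a language model here.
def toy_summarize(memory: str, turn: str) -> str:
    if "vegetarian" in turn.lower():
        return (memory + " user prefers vegetarian food").strip()
    return memory  # treat everything else as noise


def toy_answer(memory: str, question: str) -> str:
    return "vegetarian restaurant" if "vegetarian" in memory else "steakhouse"


turns = [
    "Please polish this email: 'Could we pick a place with vegetarian options?'",
    "Fix this Python bug for me.",
]
memory = build_memory(turns, toy_summarize)
reward = outcome_reward(memory, "Suggest a dinner spot", "vegetarian restaurant", toy_answer)
```

In this toy run the preference is picked up from the email-polishing request, so the later query is answered correctly and the reward is 1.0. With an empty memory the same query would earn 0.0, which is the signal that pushes the memory writer to retain implicit preferences.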
Evaluation Highlights
  • Agentic Memory achieves 55% accuracy on implicit personalization tasks, outperforming GPT-5 (approx. 40-48%)
  • Reinforcement fine-tuned Qwen3-4B outperforms GPT-5, reaching 53% accuracy on implicit personalization
  • The agentic memory framework uses 16x fewer input tokens (2k memory vs. 32k history) while achieving state-of-the-art performance
Breakthrough Assessment
9/10
Significantly advances personalization by addressing the harder, realistic problem of implicit preference inference. The agentic memory approach sidesteps the context-scaling bottleneck by replacing a 32k-token history with a 2k-token memory, while still beating frontier models.