
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents

Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister
Arizona State University, Google Cloud AI Research, University of North Carolina at Chapel Hill
Annual Meeting of the Association for Computational Linguistics (2025)
Memory, P13N, Agent, RL, RAG

📝 Paper Summary

Memory organization, Memory recall, Conversational personalization
Reflective Memory Management (RMM) improves long-term dialogue by reorganizing history into topic-based summaries (future-looking) and refining retrieval via reinforcement learning based on generation citations (backward-looking).
Core Problem
Existing long-term memory systems rely on rigid granularities (turns/sessions) that fragment semantic topics and use fixed retrievers that fail to adapt to specific user patterns.
Why it matters:
  • Rigid boundaries (e.g., sessions) cut off context, leading to incomplete retrieval and hallucinations in personalized agents.
  • Fixed retrievers cannot adapt to diverse user interaction styles without expensive labeled data, limiting performance in specialized domains.
  • Current approaches struggle to balance comprehensive storage with precise retrieval, degrading response quality when irrelevant context is included.
Concrete Example: A user mentions a fever subsiding and a cough persisting today. To respond safely, the agent must recall an allergy to penicillin mentioned a week ago. Standard session-based retrieval might miss the allergy if it was buried in a different topic thread, causing the agent to suggest unsafe medication.
Key Novelty
Reflective Memory Management (RMM)
  • Prospective Reflection: Dynamically decomposes finished sessions into atomic 'topics' rather than raw turns, and merges new information into the existing memory bank to make future lookup easier.
  • Retrospective Reflection: Uses the LLM's own citations (did I use this retrieved memory?) as a reward signal to train a lightweight reranker via reinforcement learning, adapting retrieval without human labels.
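The Retrospective Reflection idea can be illustrated with a minimal sketch. It assumes a hypothetical linear reranker over toy feature vectors, a reward of +1 for a retrieved memory that the response cited and -1 for one it ignored, and a softmax policy updated with the expected policy gradient. None of these names or values come from the summary above; the paper's actual reranker and training procedure may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def citation_rewards(candidate_ids, cited_ids, used=1.0, unused=-1.0):
    """Reward each retrieved memory by whether the response cited it.

    The +1 / -1 values are illustrative assumptions.
    """
    return [used if cid in cited_ids else unused for cid in candidate_ids]


def rerank_scores(weights, features):
    """Hypothetical linear reranker: score = dot(weights, feature vector)."""
    return [sum(w * x for w, x in zip(weights, f)) for f in features]


def policy_gradient_step(weights, features, rewards, lr=0.1):
    """One expected policy-gradient update with a mean-reward baseline."""
    probs = softmax(rerank_scores(weights, features))
    baseline = sum(p * r for p, r in zip(probs, rewards))
    new_weights = list(weights)
    for p, r, f in zip(probs, rewards, features):
        advantage = r - baseline
        for k, x in enumerate(f):
            # Gradient of the expected reward w.r.t. weight k.
            new_weights[k] += lr * p * advantage * x
    return new_weights


candidate_ids = ["m1", "m2", "m3"]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy feature vectors
cited_ids = {"m2"}  # memories the generated response actually cited

weights = [0.0, 0.0]
rewards = citation_rewards(candidate_ids, cited_ids)
for _ in range(20):
    weights = policy_gradient_step(weights, features, rewards)

scores = rerank_scores(weights, features)
best = candidate_ids[scores.index(max(scores))]
print(best)
```

The point of the sketch is that the only supervision is the set of cited memory IDs, so no human relevance labels are needed. In practice the reward would be recomputed for each new query and response rather than reused across steps as in this toy loop.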
Evaluation Highlights
  • +10% accuracy improvement over baselines without memory management on the LongMemEval dataset.
  • +5.9% METEOR score improvement over RAG baselines on the MSC dataset using the GTE retriever.
  • Achieves 70.4% accuracy on LongMemEval with GTE, outperforming fixed-granularity methods and specialized agents like MemoryBank and LD-Agent.
Breakthrough Assessment
8/10
Strong conceptual novelty in coupling topic-based granularity with self-supervised RL for retrieval. Substantial empirical gains (+10% accuracy on LongMemEval) make it a notable advance in personalized memory.