
Memory in the Age of AI Agents

Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, ZhongXiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhen Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yu Niu, Yu Wang, Zhe Yin, et al.
National University of Singapore, Renmin University of China, Fudan University, Peking University, Nanyang Technological University, Tongji University, University of California San Diego, Hong Kong University of Science and Technology (Guangzhou), Griffith University, Georgia Institute of Technology, OPPO, Oxford University
arXiv.org (2025)
Memory Agent RAG Benchmark RL MM

📝 Paper Summary

Agent memory architecture, Long-term memory for LLMs, Agentic RAG
The survey unifies fragmented research on agent memory into a single taxonomy of Forms (structure), Functions (utility), and Dynamics (evolution). It also distinguishes agent memory from RAG and context engineering and identifies future research frontiers.
Core Problem
Research on agent memory is fragmented with inconsistent terminology, making it difficult to distinguish true agentic memory from related concepts like RAG or simple context window management.
Why it matters:
  • Current definitions conflate 'LLM memory' (context caching) with 'Agent memory' (persistent, evolving cognitive state), hindering clarity.
  • Traditional 'long-term/short-term' taxonomies fail to capture the complexity of modern agents that need to evolve, forget, and consolidate experience over long horizons.
  • Developers lack a unified framework to design agents that can maintain identity and learn skills across varying tasks without starting from scratch.
Concrete Example: Early systems like MemoryBank framed their contributions as 'LLM memory', but they were actually addressing agentic challenges like tracking user preferences across days. Without a clear taxonomy, a researcher might confuse architectural caching (like Mamba) with cognitive memory (like a user profile), leading to misaligned system designs.
Key Novelty
Unified Forms-Functions-Dynamics Taxonomy
  • Forms: Classifies memory into Token-level (discrete text/visual units), Parametric (encoded in model weights), and Latent (hidden states/activations).
  • Functions: Distinguishes Factual (knowledge), Experiential (skills/history), and Working memory (current task workspace) rather than just temporal duration.
  • Dynamics: Models memory not as static storage but as a lifecycle of Formation (creation), Evolution (consolidation/forgetting), and Retrieval (access).
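The three axes above can be sketched as a small data model. This is only an illustrative sketch, not code from the survey: the names MemoryItem and MemoryStore, the numeric strength field, the decay-and-threshold forgetting rule, and the substring-match retrieval are all assumptions chosen to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Form(Enum):
    """How a memory is represented (the 'Forms' axis)."""
    TOKEN_LEVEL = "token-level"  # discrete text/visual units
    PARAMETRIC = "parametric"    # encoded in model weights
    LATENT = "latent"            # hidden states / activations


class Function(Enum):
    """What a memory is used for (the 'Functions' axis)."""
    FACTUAL = "factual"            # knowledge
    EXPERIENTIAL = "experiential"  # skills / history
    WORKING = "working"            # current task workspace


@dataclass
class MemoryItem:
    # Hypothetical record type; 'strength' is an assumed stand-in for
    # whatever signal a real system uses to decide what to forget.
    content: str
    form: Form
    function: Function
    strength: float = 1.0


@dataclass
class MemoryStore:
    items: list[MemoryItem] = field(default_factory=list)

    def form_memory(self, content: str, form: Form, function: Function) -> MemoryItem:
        """Formation: create a new memory item and store it."""
        item = MemoryItem(content, form, function)
        self.items.append(item)
        return item

    def evolve(self, decay: float = 0.5, threshold: float = 0.3) -> None:
        """Evolution: weaken every item, then forget those below the threshold."""
        for item in self.items:
            item.strength *= decay
        self.items = [i for i in self.items if i.strength >= threshold]

    def retrieve(self, query: str, function: Function | None = None) -> list[MemoryItem]:
        """Retrieval: naive case-insensitive substring match, optionally
        filtered by function."""
        q = query.lower()
        return [
            i for i in self.items
            if q in i.content.lower() and (function is None or i.function is function)
        ]


store = MemoryStore()
store.form_memory("User prefers vegetarian recipes", Form.TOKEN_LEVEL, Function.FACTUAL)
store.form_memory("Retrying the API call after a timeout worked", Form.TOKEN_LEVEL, Function.EXPERIENTIAL)
hits = store.retrieve("vegetarian", function=Function.FACTUAL)
```

Tagging each item with both a Form and a Function reflects the point that the two axes are independent: the same token-level record can serve as factual or experiential memory. A real system would replace the substring match with embedding or graph retrieval and the fixed decay with a learned or heuristic consolidation policy.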
Evaluation Highlights
  • The paper reports no quantitative evaluation results, since it is a survey.
  • Compiles a list of key benchmarks including LoCoMo, LongMemEval, and GAIA.
  • Summarizes open-source frameworks like Memary, MemOS, and Mem0.
Breakthrough Assessment
9/10
Provides a crucial, clarifying taxonomy for a rapidly saturating field. By rigorously distinguishing Agent Memory from RAG and Context Engineering, it sets the standard for future definitions.