Parametric Memory: Knowledge implicitly stored within the model's neural network weights, acquired during training
Contextual Memory: Explicit external information (text, databases, KV cache) provided to the model during inference
KV Cache: Key-Value cache. Temporary storage of the attention key and value tensors computed for already-processed tokens, reused during inference so they are not recomputed at every generation step (short-term memory)
Consolidation: The process of transforming short-term experiences/observations into persistent long-term storage
RCI: Relative Citation Index. A metric used in this survey that normalizes citation counts by publication age to compare impact across different years
RAG: Retrieval-Augmented Generation. Systems that first retrieve relevant documents from an external source and then condition the model's generated answer on them
Episodic Memory: Storage of temporally anchored experiences, such as dialogue histories and event sequences
Semantic Memory: Storage of facts and general knowledge, often in knowledge graphs or model parameters
Procedural Memory: Memory of how to perform tasks or use tools, often implicit in trained weights or explicit in stored trajectories
Working Memory: A dynamic control mechanism integrating short-term caches and activated long-term knowledge for real-time reasoning
KV Cache Eviction: Techniques to selectively remove less important tokens from the KV cache to manage memory footprint in long-context tasks
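
To make the KV Cache and KV Cache Eviction entries concrete, here is a minimal Python sketch of a cache with a fixed token budget. It is an illustration under stated assumptions, not a method taken from the survey: the class name ToyKVCache, the parameters budget and recent, and the policy itself (evict the token with the lowest accumulated attention, but never one of the most recent tokens) are all hypothetical choices made for this example. Keys and values are plain Python objects standing in for tensors.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ToyKVCache:
    """Toy single-layer KV cache with a token budget (hypothetical example)."""

    budget: int          # maximum number of tokens kept in the cache
    recent: int = 2      # newest tokens that are never evicted
    positions: List[int] = field(default_factory=list)  # original token positions
    keys: List[Any] = field(default_factory=list)
    values: List[Any] = field(default_factory=list)
    scores: List[float] = field(default_factory=list)   # accumulated attention

    def __post_init__(self) -> None:
        if self.budget <= self.recent:
            raise ValueError("budget must exceed the protected recent window")

    def append(self, position: int, key: Any, value: Any) -> None:
        """Cache one new token, evicting an old one if over budget."""
        self.positions.append(position)
        self.keys.append(key)
        self.values.append(value)
        self.scores.append(0.0)
        if len(self.positions) > self.budget:
            self._evict_one()

    def add_attention(self, weights: List[float]) -> None:
        """Accumulate one attention weight per currently cached token."""
        if len(weights) != len(self.scores):
            raise ValueError("need exactly one weight per cached token")
        for i, w in enumerate(weights):
            self.scores[i] += w

    def _evict_one(self) -> None:
        # Only tokens outside the protected recent window are candidates.
        candidates = range(len(self.positions) - self.recent)
        victim = min(candidates, key=lambda i: self.scores[i])
        for buf in (self.positions, self.keys, self.values, self.scores):
            del buf[victim]


cache = ToyKVCache(budget=4, recent=2)
for pos in range(4):
    cache.append(pos, f"k{pos}", f"v{pos}")
cache.add_attention([0.5, 0.1, 0.3, 0.1])  # made-up attention weights
cache.append(4, "k4", "v4")                # exceeds the budget, triggers eviction
print(cache.positions)
```

The protected recent window is there because a newly appended token starts with a score of zero and would otherwise be evicted immediately. Real systems operate on per-layer, per-head tensors and use more refined importance estimates; this sketch only shows the bookkeeping.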