Token-level Memory: Memory stored as discrete, inspectable units (text, JSON, visual tokens) external to model parameters
Parametric Memory: Memory encoded implicitly in the model's neural network weights, typically updated through fine-tuning or other gradient-based training
Latent Memory: Memory represented as continuous vectors (hidden states or activations) that persist across inference steps
Factual Memory: Storage of declarative knowledge about users (profiles) and the environment (world state)
Experiential Memory: Storage of procedural knowledge, such as past successful plans, failures, or distilled skills
Working Memory: Temporary workspace for managing information relevant to the current active task or reasoning chain
RAG: Retrieval-Augmented Generation—typically using static external databases to ground generation, distinct from self-evolving agent memory
Context Engineering: Optimizing the information payload within the LLM's finite context window (resource management), distinct from the cognitive scope of memory
KV Cache: Key-Value Cache, which stores the attention keys and values already computed for earlier tokens so they are not recomputed at each generation step; often confused with agent memory
Graph RAG: Structuring knowledge as a graph (nodes/edges) to enable relational retrieval, used in both RAG and agent memory systems
MCP: Model Context Protocol—a standard for connecting AI assistants to data systems, falling under context engineering
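
To make a few of these terms concrete, the following is a minimal sketch of a token-level memory that holds factual and experiential records as plain text. It is an illustration only: the class names `MemoryRecord` and `TokenLevelMemory`, the `write` and `retrieve` methods, and the word-overlap scoring are all hypothetical choices, not taken from any particular system. A real agent would more likely use embeddings or a graph index for retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    kind: str  # "factual" or "experiential" (labels borrowed from the glossary)
    text: str  # human-readable content, inspectable without the model


@dataclass
class TokenLevelMemory:
    """Hypothetical store: discrete records kept outside model parameters."""
    records: list = field(default_factory=list)

    def write(self, kind, text):
        # Unlike a static RAG corpus, the agent can append at run time.
        self.records.append(MemoryRecord(kind, text))

    def retrieve(self, query, k=2):
        # Assumed scoring: number of lowercase words shared with the query.
        q = set(query.lower().split())
        scored = [(len(q & set(r.text.lower().split())), r) for r in self.records]
        scored = [(s, r) for s, r in scored if s > 0]
        # Stable sort keeps insertion order among equally scored records.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [r for _, r in scored[:k]]


memory = TokenLevelMemory()
memory.write("factual", "user prefers metric units")
memory.write("experiential", "retrying the weather API after a timeout succeeded")
memory.write("factual", "user lives in Lisbon")

hits = memory.retrieve("user units")
for record in hits:
    print(record.kind, "|", record.text)
```

The retrieved records would then be placed into the context window for the current task, which is where working memory and context engineering come in. Nothing here touches model weights (parametric memory) or hidden states (latent memory).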