Sensory Memory: In humans, the brief retention of raw sensory input; in LLMs, the analogue is the immediate input (the incoming request or prompt)
Short-Term Memory (STM): In LLMs, the tokens held in the current context window, which the model accesses through attention mechanisms
Long-Term Memory (LTM): Persistent storage for LLM systems, typically implemented outside the model via external databases, vector stores, or graph structures
Episodic Memory: Memory of specific personal events and experiences (e.g., past user interactions)
Semantic Memory: Storage of general factual knowledge and concepts (e.g., facts from training data)
Procedural Memory: Implicit memory for skills and automated tasks (e.g., 'instincts' or learned behaviors in agents)
RAG: Retrieval-Augmented Generation; AI systems that answer questions by first retrieving relevant documents and then passing them to the model as context for generating the answer
Context Window: The maximum number of tokens (units of text) a model can process in a single interaction (e.g., 128k tokens for GPT-4o)
Graph-RAG: A variant of Retrieval-Augmented Generation that retrieves over graph-based structures (e.g., entities and their relationships), with the aim of improving retrieval accuracy and scalability
KV cache: Key-Value cache used in Transformers to store the attention keys and values already computed for earlier tokens, so they are not recomputed at each generation step; it acts as a form of memory during generation
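
As a rough illustration of how several of the terms above fit together (Long-Term Memory as a vector store, RAG, and the Context Window as a token budget), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names embed, ToyVectorStore, and build_prompt are hypothetical, the bag-of-words "embedding" stands in for a real embedding model, and counting whitespace-separated words stands in for a real tokenizer.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical long-term memory: keeps passages alongside their vectors."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored passages most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(store, question, k=2, max_tokens=40):
    """Retrieve passages, then keep only those that fit the token budget.

    Words are counted by whitespace splitting, a crude approximation of tokens.
    """
    budget = max_tokens - len(question.split())
    kept = []
    for passage in store.search(question, k):
        cost = len(passage.split())
        if cost > budget:
            break  # the context window is full; drop the remaining passages
        kept.append(passage)
        budget -= cost
    return "Context:\n" + "\n".join(kept) + "\nQuestion: " + question


# Usage: fill the store, then assemble a prompt for a (hypothetical) model call.
store = ToyVectorStore()
store.add("the kv cache stores keys and values for past tokens")
store.add("graph rag retrieves from a knowledge graph")
store.add("bananas are yellow fruit")

question = "what does the kv cache store"
print(build_prompt(store, question, k=1))
```

In a real system the assembled prompt would be sent to the model, and the retrieved passages would compete with the conversation history for space in the context window, which is why the sketch stops adding passages once the budget runs out.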