Virtual Context Management: A technique inspired by OS virtual memory that provides the illusion of extended context by swapping data between the LLM's prompt and external storage
Main Context: Analogous to RAM/Physical Memory; the actual prompt tokens fed to the LLM during inference (includes System Instructions, Working Context, and FIFO Queue)
External Context: Analogous to Disk Storage; out-of-context data stored in databases (Recall Storage, Archival Storage) that must be explicitly retrieved to be seen by the LLM
FIFO Queue: A rolling history of recent messages kept in the Main Context; messages evicted from here move to Recall Storage
Recall Storage: A database storing the entire history of messages (user inputs, agent outputs) that have been evicted from the active context window
Archival Storage: A read/write database for storing arbitrary-length text objects or documents, searchable via vector similarity
Function Chaining: The ability of the LLM to execute multiple function calls sequentially (e.g., search page 1, then search page 2) before returning control/response to the user
System Instructions: Read-only static prompt section defining the agent's persona, memory hierarchy rules, and available function schemas
Working Context: A fixed-size read/write text block in Main Context for storing key facts, preferences, and immediate state information
CSIM: Cosine Similarity metric used to measure how well the agent's generated text aligns with a gold standard persona or embedding
DMR: Deep Memory Retrieval; a task evaluating an agent's ability to answer questions based on specific details from distant past conversations
Recursive Summary: A summary of evicted messages maintained at the start of the FIFO queue to retain high-level context of what has left the window
LLM Processor: The inference engine (e.g., GPT-4) that takes Main Context as input and generates completion tokens (function calls or text)
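
To make the relationship between several of these terms concrete, here is a minimal Python sketch of how Main Context, the FIFO Queue, the Recursive Summary, and Recall Storage might fit together. It is an illustration under stated assumptions, not the original system's implementation: the class name ToyAgentMemory, the helper naive_summarize, the max_queue_messages cap (a real system would budget tokens rather than count messages), and the substring search are all hypothetical. In particular, the summary here is built by plain string joining, whereas a real agent would presumably ask the LLM to write it.

```python
from collections import deque
from dataclasses import dataclass, field


def naive_summarize(previous_summary, evicted_messages):
    """Placeholder summarizer (assumption): joins text instead of calling an LLM."""
    parts = ([previous_summary] if previous_summary else []) + list(evicted_messages)
    return " | ".join(parts)


@dataclass
class ToyAgentMemory:
    """Toy model of Main Context plus Recall Storage (all names hypothetical)."""
    system_instructions: str   # treated as read-only by convention
    max_queue_messages: int    # hypothetical cap; real systems budget tokens
    working_context: str = ""  # read/write block for key facts
    recursive_summary: str = ""
    fifo_queue: deque = field(default_factory=deque)
    recall_storage: list = field(default_factory=list)  # stands in for a database

    def append_message(self, message):
        """Add a message; evict the oldest ones once the queue is over its cap."""
        self.fifo_queue.append(message)
        evicted = []
        while len(self.fifo_queue) > self.max_queue_messages:
            evicted.append(self.fifo_queue.popleft())
        if evicted:
            # Evicted messages move to Recall Storage and are folded
            # into the Recursive Summary that stays in Main Context.
            self.recall_storage.extend(evicted)
            self.recursive_summary = naive_summarize(self.recursive_summary, evicted)

    def search_recall(self, query):
        """Explicit retrieval from external context (case-insensitive substring match)."""
        return [m for m in self.recall_storage if query.lower() in m.lower()]

    def render_main_context(self):
        """Assemble the prompt text that would be fed to the LLM Processor."""
        sections = [self.system_instructions, self.working_context]
        if self.recursive_summary:
            sections.append("Summary of evicted messages: " + self.recursive_summary)
        sections.extend(self.fifo_queue)
        return "\n".join(s for s in sections if s)


memory = ToyAgentMemory("You are a helpful agent.", max_queue_messages=2)
for text in ["user: my dog is Rex", "agent: noted", "user: hello"]:
    memory.append_message(text)
print(memory.render_main_context())
print(memory.search_recall("rex"))
```

The design point the sketch tries to capture is that eviction does not lose information: the oldest message leaves the prompt, but it remains reachable through an explicit search call, and a compressed trace of it stays in the prompt via the summary.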