Sensory Memory: A cognitively inspired buffer that compresses raw input, filtering out irrelevant tokens before the input reaches short-term memory
Sleep-time Update: An offline mechanism where the system reorganizes and consolidates memory entries during idle periods, decoupling heavy maintenance from real-time inference
Soft Update: A fast, temporary insertion of new memory entries with timestamps during inference, ensuring immediate availability without triggering expensive re-indexing
LLMLingua-2: A token classification model used to identify and retain only essential tokens for compression
Topic Segmentation: Dividing dialogue history into chunks based on semantic shifts rather than fixed sizes to preserve context integrity
STM: Short-Term Memory, a temporary buffer for recent topic-based segments
LTM: Long-Term Memory, persistent storage for consolidated memory entries
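
The difference between a Soft Update and a Sleep-time Update, as defined above, can be illustrated with a minimal Python sketch. It is not the system's actual implementation. The names `MemoryEntry`, `MemoryStore`, `soft_update`, `sleep_time_update`, and `search` are hypothetical, and a simple keyword index stands in for whatever index the real system maintains. The sketch assumes soft-updated entries sit in an unindexed list that is scanned at query time, and that the idle-time pass moves them into the index.

```python
import time
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str
    timestamp: float


@dataclass
class MemoryStore:
    # Hypothetical structure: soft-updated entries wait in `pending`;
    # `index` stands in for the consolidated long-term memory.
    pending: list = field(default_factory=list)
    index: dict = field(default_factory=dict)

    def soft_update(self, text, now=None):
        """Cheap insert during inference: timestamp and append, no re-indexing."""
        entry = MemoryEntry(text, time.time() if now is None else now)
        self.pending.append(entry)
        return entry

    def search(self, keyword):
        """Look up the index, then scan pending entries so new ones are visible."""
        kw = keyword.lower()
        hits = list(self.index.get(kw, []))
        hits += [e for e in self.pending if kw in e.text.lower().split()]
        return hits

    def sleep_time_update(self):
        """Offline pass (assumed to run while idle): index pending entries."""
        moved = 0
        for entry in sorted(self.pending, key=lambda e: e.timestamp):
            for word in set(entry.text.lower().split()):
                self.index.setdefault(word, []).append(entry)
            moved += 1
        self.pending.clear()
        return moved


store = MemoryStore()
store.soft_update("Alice likes tea", now=1.0)
found_before = store.search("tea")   # served from the pending list
consolidated = store.sleep_time_update()
found_after = store.search("tea")    # served from the index
```

In this sketch the entry is retrievable immediately after the soft update, while the more expensive indexing work is deferred to the offline pass, which is the decoupling the two definitions describe.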