TA-Mem: Tool-Augmented Autonomous Memory, the proposed framework in which an agent selects which retrieval tools to use when querying memory.
LoCoMo: Long Context Memory dataset—a benchmark for evaluating very long-term conversational memory in agents.
Episodic Memory: Memory of specific events, experiences, and their temporal context (who, what, when, where).
Semantic Chunking: Splitting text into segments based on shifts in meaning or topic, rather than arbitrary token counts.
F1 score: In this context, a token-level metric: the harmonic mean of precision and recall over the tokens shared between the predicted answer and the ground truth.
BLEU-1: A precision-based metric measuring the fraction of unigrams (single words) in the generated text that also appear in the reference text.
Agentic Loop: A cyclic process where an AI agent observes an environment, reasons, selects an action (tool), and repeats until a stopping condition is met.
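
The two answer-quality metrics defined above (F1 score and BLEU-1) can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code used for LoCoMo: the helper names (`tokenize`, `token_f1`, `bleu1`) are hypothetical, the lowercase whitespace tokenizer is an assumption (real evaluation scripts often also strip punctuation and articles), and the brevity penalty follows the standard BLEU definition, which may or may not match the exact variant reported in the source.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split. Benchmark scripts may
    # normalize differently (punctuation, articles, stemming).
    return text.lower().split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty is a miss.
        return float(pred == ref)
    # Multiset intersection: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def bleu1(prediction: str, reference: str) -> float:
    """Clipped unigram precision times the standard BLEU brevity penalty."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred:
        return 0.0
    clipped = sum((Counter(pred) & Counter(ref)).values())
    precision = clipped / len(pred)
    # Brevity penalty: predictions shorter than the reference are penalized.
    if len(pred) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(pred))
    return brevity_penalty * precision


if __name__ == "__main__":
    pred = "the cat sat"
    ref = "the cat sat on the mat"
    print(f"F1:     {token_f1(pred, ref):.3f}")
    print(f"BLEU-1: {bleu1(pred, ref):.3f}")
```

In the example, every predicted token appears in the reference, so precision is perfect while recall is only one half; F1 reflects both, whereas BLEU-1 penalizes the short prediction only through the brevity penalty.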