RAG: Retrieval-Augmented Generation, an approach in which a system first retrieves documents relevant to a query and then supplies them to a language model as context for generating the answer
FAISS: Facebook AI Similarity Search—a library for efficient similarity search and clustering of dense vectors
Semantic Cache: A cache that stores key-value pairs where keys are vector embeddings, allowing retrieval based on semantic similarity rather than exact string matching
HNSW: Hierarchical Navigable Small World—a graph-based algorithm used for approximate nearest neighbor search in vector databases
TTL: Time To Live, a limit on how long a piece of data, such as a cache entry or a network packet, remains valid before it is discarded
Qdrant: A vector database engine used for storing and searching vector embeddings
LRU: Least Recently Used—a cache replacement policy that discards the least recently used items first
Cold start: The initial state of the system where the cache is empty, resulting in higher latency for the first few interactions
System 1 / System 2: A cognitive framework where System 1 is fast/instinctive and System 2 is slow/deliberative; here applied to foreground vs. background processing agents
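
To show how several of these terms fit together, here is a minimal sketch of a semantic cache with TTL expiry and LRU eviction. It is an illustration only, not the implementation of any system described here: the class name `SemanticCache`, its parameters (`max_items`, `ttl_seconds`, `threshold`, `clock`), and the 0.9 similarity threshold are all hypothetical, and a plain linear scan with cosine similarity stands in for the approximate nearest neighbor index (FAISS, or HNSW in Qdrant) that a real deployment would likely use.

```python
import math
import time
from collections import OrderedDict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical semantic cache: keys are embeddings, lookup is by similarity."""

    def __init__(self, max_items=2, ttl_seconds=60.0, threshold=0.9,
                 clock=time.monotonic):
        self.max_items = max_items      # LRU capacity (assumed value)
        self.ttl_seconds = ttl_seconds  # lifetime of each entry (assumed value)
        self.threshold = threshold      # minimum similarity for a hit (assumed value)
        self.clock = clock              # injectable clock, useful for testing
        self._items = OrderedDict()     # id -> (embedding, value, expires_at)
        self._next_id = 0

    def put(self, embedding, value):
        expires_at = self.clock() + self.ttl_seconds
        self._items[self._next_id] = (list(embedding), value, expires_at)
        self._next_id += 1
        # LRU: the front of the OrderedDict is the least recently used entry.
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)

    def get(self, embedding):
        now = self.clock()
        # TTL: drop every entry whose lifetime has run out.
        for key in [k for k, (_, _, exp) in self._items.items() if exp <= now]:
            del self._items[key]
        # Linear scan for the most similar stored key at or above the threshold.
        best_key, best_sim = None, self.threshold
        for key, (vec, _, _) in self._items.items():
            sim = cosine(embedding, vec)
            if sim >= best_sim:
                best_key, best_sim = key, sim
        if best_key is None:
            return None  # miss
        self._items.move_to_end(best_key)  # mark as most recently used
        return self._items[best_key][1]


cache = SemanticCache(max_items=2, ttl_seconds=60.0, threshold=0.9)
print(cache.get([1.0, 0.0]))    # cold start: the cache is empty, so this is a miss
cache.put([1.0, 0.0], "answer A")
print(cache.get([0.99, 0.05]))  # a nearby embedding hits without an exact match
```

The two-dimensional vectors are toy stand-ins for real embeddings. In practice the threshold trades hit rate against the risk of returning an answer to a question that only looks similar, so it would need tuning against real queries.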