Deep Research (DR): An agentic workflow where LLMs autonomously plan, retrieve, reason, and synthesize information to solve complex, open-ended problems beyond simple Q&A.
RAG: Retrieval-Augmented Generation, an approach in which a system retrieves documents, typically from a static corpus, and conditions its generated response on them.
MCTS: Monte Carlo Tree Search—a heuristic search algorithm used for decision processes, here applied to exploring reasoning paths in query planning.
SFT: Supervised Fine-Tuning—training a model on labeled examples to adapt it to specific tasks like tool use or query decomposition.
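GRPO: Group Relative Policy Optimization, a reinforcement learning algorithm that samples a group of outputs for each prompt and scores every output relative to the rest of its group, which removes the need for a separate value model.
PPO: Proximal Policy Optimization, a widely used policy-gradient reinforcement learning algorithm that limits the size of each policy update through a clipped objective.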
CoT: Chain-of-Thought—a prompting technique where models generate intermediate reasoning steps before the final answer.
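SPLADE: SParse Lexical AnD Expansion model, a neural retrieval method that learns sparse term weights over the vocabulary, including expansion terms absent from the input text, so that documents can be searched with an inverted index.
ColBERT: Contextualized Late Interaction over BERT, a multi-vector dense retrieval model that encodes queries and documents as token-level embeddings and scores a pair by summing, for each query token, its maximum similarity to any document token.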
LayoutLM: A document understanding model that jointly encodes text, layout (the 2-D position of each token on the page), and visual information.
Multimodal Retrieval: Retrieval systems that index and search across text, images, charts, and tables simultaneously.
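
Illustration: two of the entries above, GRPO and ColBERT, describe computations that are easier to grasp with a small example. The sketch below is a minimal, dependency-free Python version of each idea. The function names, the eps constant, the use of the population standard deviation, and the toy vectors are assumptions made for illustration; they are not taken from any particular library or from the systems discussed in this document.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: normalize each reward against its own group.

    `rewards` holds one scalar reward per sampled output for a single prompt.
    Using the population standard deviation and `eps` is an assumption here.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def late_interaction_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction (MaxSim).

    For every query token embedding, take the largest dot product with any
    document token embedding, then sum those maxima over the query tokens.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


if __name__ == "__main__":
    # Four sampled outputs for one prompt; two were rewarded, two were not.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))

    # Two query tokens and two document tokens in a toy 2-D embedding space.
    query = [[1.0, 0.0], [0.0, 1.0]]
    doc = [[1.0, 0.0], [0.5, 0.5]]
    print(late_interaction_score(query, doc))
```

In the first function, outputs that beat their group's average receive a positive advantage and the rest a negative one, so no learned value model is needed to provide a baseline. In the second, the query and document are encoded independently and only interact at scoring time, which is what "late interaction" refers to.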