Mind-Map: A structured knowledge graph agent that stores reasoning context, clusters it, and allows the model to query past reasoning steps to maintain coherence
DeepSeek-R1: The base large reasoning model used as the primary LLM in this framework
GraphRAG: A method using knowledge graphs to structure and retrieve information, used here to build the Mind-Map
GPQA: A PhD-level multiple-choice science QA benchmark used to evaluate expert-level reasoning
GAIA: A benchmark for AI agents assessing reasoning, web browsing, and tool-use proficiency
Humanity's Last Exam: A difficult benchmark assessing AI performance across a broad range of expert subjects
Cohere Rerank: A commercial reranking model used to filter and order search results based on relevance
ROUGE: A set of metrics used to evaluate automatic summarization and machine translation by comparing system output against human-written references
SOTA: State-of-the-Art, the current best performance achievable by any known method