RAG: Retrieval-Augmented Generation, an approach in which AI systems answer questions by first retrieving relevant documents and then generating a response grounded in them
Community Report: A high-level textual summary of a cluster of nodes (community) in a graph, used to answer abstract questions
Retrieval Operator: An atomic function (e.g., 'Onehop', 'PPR') that selects specific graph elements (nodes, edges, chunks) based on a query
PPR: Personalized PageRank—an algorithm to find nodes relevant to a seed set by simulating random walks with restarts
Steiner Tree: A tree that connects a specific set of required nodes (seeds) with the minimum total edge weight, optionally passing through additional non-seed nodes
VGraphRAG: The new method proposed in this paper that combines entity linking with vector-based retrieval of communities and chunks
Abstract QA: Questions requiring high-level understanding or summarization of broad topics rather than specific factual lookups
TKG: Textual Knowledge Graph—a KG where entities and relationships have associated textual descriptions
Leiden algorithm: A community detection algorithm used to cluster nodes in the graph for hierarchical analysis
Map-Reduce: A generation strategy where an LLM processes retrieved contexts in parallel (Map) and then summarizes the results (Reduce)
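Illustration for the PPR entry above: a minimal sketch of Personalized PageRank computed by power iteration on a toy graph. It is not the paper's implementation. The function name `personalized_pagerank`, the restart probability `alpha=0.15`, the iteration count, and the graph `toy_graph` are all assumptions chosen for illustration.

```python
def personalized_pagerank(graph, seeds, alpha=0.15, iters=50):
    """Approximate PPR scores by power iteration.

    graph: dict mapping each node to a list of its out-neighbors
           (every neighbor must also appear as a key).
    seeds: non-empty set of seed nodes the random walk restarts from.
    alpha: restart probability (hypothetical default, not from the source).
    """
    nodes = list(graph)
    # Restart distribution: uniform over the seed set, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iters):
        # With probability alpha the walker jumps back to a seed.
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                # Otherwise it moves to a uniformly chosen out-neighbor.
                share = (1 - alpha) * scores[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - alpha) * scores[n] / len(seeds)
        scores = nxt
    return scores


# Toy undirected graph written as adjacency lists (illustrative only).
toy_graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

scores = personalized_pagerank(toy_graph, {"A"})
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Nodes close to the seed "A" receive more of the walk's probability mass than distant ones such as "E", which is the property a retrieval operator relies on when it ranks graph elements against entities linked from a query.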