
Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval

S Ma, C Xu, X Jiang, M Li, H Qu, J Guo
International Digital Economy Academy
arXiv, July 2024
RAG KG Reasoning QA

πŸ“ Paper Summary

Graph-based RAG pipeline · Hybrid RAG (Text + Knowledge Graph)
ToG-2 is a hybrid RAG framework that iteratively alternates between graph-based relation exploration and document-based context verification to achieve deep, faithful reasoning for complex questions.
Core Problem
Current RAG methods struggle with complex reasoning because vector retrieval misses structural links between entities, while Knowledge Graphs (KGs) lack detailed context due to incompleteness.
Why it matters:
  • Vector-based RAG often retrieves superficially similar texts but misses deep logical connections needed for multi-hop reasoning
  • Existing hybrid approaches loosely couple KG and text (e.g., just aggregating results), failing to use one source to guide deeper exploration in the other
  • LLMs hallucinate or fail to maintain reasoning trajectories when integrating fragmented information without a structured roadmap
Concrete Example: For the question 'What are the competition records of the athlete born in the same place as Craig Virgin?', vector RAG might retrieve generic bios for Craig Virgin but miss the link to 'Lebanon, Illinois' and the athletes connected to it. Pure KG RAG might find the birthplace but, because the graph is incomplete, lack the specific 'competition records' text for the linked athlete (e.g., Lukas Verzbicas).
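The failure mode in this example can be reproduced on toy data. The sketch below is only an illustration: the passages and triples are invented placeholders that mirror the example, the relation names (`place_of_birth`, `linked_athlete`, `competition_record`) are hypothetical, and simple word overlap stands in for a real vector retriever.

```python
import re

QUESTION = ("What are the competition records of the athlete "
            "born in the same place as Craig Virgin?")

# Placeholder passages (invented for illustration, not real corpus text).
PASSAGES = {
    "Craig Virgin": "Biography of Craig Virgin, an athlete born in Lebanon, Illinois.",
    "Lukas Verzbicas": "Lukas Verzbicas holds several competition records.",
}

# Toy triples mirroring the example; relation names are hypothetical.
TRIPLES = [
    ("Craig Virgin", "place_of_birth", "Lebanon, Illinois"),
    ("Lebanon, Illinois", "linked_athlete", "Lukas Verzbicas"),
]

def tokens(text):
    """Lowercased word set, used as a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap(a, b):
    return len(tokens(a) & tokens(b))

# "Vector" retrieval: the top hit is the superficially similar bio.
top_passage = max(PASSAGES, key=lambda title: overlap(QUESTION, PASSAGES[title]))

# KG-only reasoning: two hops reach the right athlete...
places = [o for s, r, o in TRIPLES if s == "Craig Virgin" and r == "place_of_birth"]
athletes = [o for s, r, o in TRIPLES if s in places and r == "linked_athlete"]

# ...but the incomplete graph holds no record facts for that athlete.
records = [o for s, r, o in TRIPLES if s in athletes and r == "competition_record"]
```

Here `top_passage` is the Craig Virgin bio, `athletes` contains the linked athlete, and `records` is empty: each source alone holds only half of the answer.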
Key Novelty
Tight-Coupling Hybrid RAG (KG × Text)
  • Uses Knowledge Graphs as a navigation map to guide document retrieval: KG relations identify candidate entities that might contain answers, preventing aimless vector search
  • Uses Documents to prune the Knowledge Graph: Textual context is used to verify which KG entities are actually relevant to the specific query, filtering out irrelevant graph paths
  • Iterative 'Think-on-Graph' loop: Alternates between expanding search on the graph and verifying deeper clues in text until sufficient information is found
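The alternation described in these three points can be sketched as a small loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `tog2_sketch`, `relevance`, `KG`, and `DOCS` are hypothetical names, the graph and passages are toy placeholders, word overlap stands in for the retriever, and the `is_sufficient` callback stands in for the LLM's judgment of whether enough evidence has been gathered.

```python
import re
from typing import Callable, Dict, List, Set, Tuple

Evidence = List[Tuple[str, str]]  # (entity, supporting passage)

# Toy KG: entity -> [(relation, neighbor)]. Relation names are hypothetical.
KG: Dict[str, List[Tuple[str, str]]] = {
    "Craig Virgin": [("place_of_birth", "Lebanon, Illinois"), ("sport", "Running")],
    "Lebanon, Illinois": [("linked_athlete", "Lukas Verzbicas"), ("located_in", "Illinois")],
}

# Toy entity documents (placeholder text, not real corpus content).
DOCS: Dict[str, str] = {
    "Lebanon, Illinois": "Placeholder passage: the place where Craig Virgin was born.",
    "Running": "Placeholder passage about running as a sport.",
    "Lukas Verzbicas": "Placeholder passage on the competition records of the athlete Lukas Verzbicas.",
    "Illinois": "Placeholder passage about a state.",
}

def tokens(text: str) -> Set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def relevance(question: str, text: str) -> int:
    """Word overlap; a stand-in for a dense retriever's similarity score."""
    return len(tokens(question) & tokens(text))

def tog2_sketch(question: str, topic_entities: List[str],
                is_sufficient: Callable[[str, Evidence], bool],
                max_depth: int = 3, width: int = 1) -> Evidence:
    evidence: Evidence = []
    frontier = list(topic_entities)
    visited = set(frontier)
    for _ in range(max_depth):
        # Graph step: KG relations propose candidate entities to look at next.
        candidates = [n for e in frontier for _rel, n in KG.get(e, []) if n not in visited]
        if not candidates:
            break
        # Text step: candidates' documents decide which entities survive pruning.
        ranked = sorted(candidates, key=lambda e: relevance(question, DOCS.get(e, "")),
                        reverse=True)
        frontier = ranked[:width]
        visited.update(frontier)
        evidence.extend((e, DOCS[e]) for e in frontier if e in DOCS)
        # Stop as soon as the collected evidence is judged sufficient.
        if is_sufficient(question, evidence):
            break
    return evidence

question = ("What are the competition records of the athlete "
            "born in the same place as Craig Virgin?")
found = tog2_sketch(question, ["Craig Virgin"],
                    is_sufficient=lambda q, ev: any("competition records" in t for _, t in ev))
```

On this toy data the loop keeps 'Lebanon, Illinois' over 'Running' at the first hop and reaches the linked athlete's passage at the second. The `width` parameter controls how many entities survive each pruning step; a real system would trade recall against LLM and retrieval cost here.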
Architecture
Architecture Figure: Figure 1(d)
Conceptual workflow of ToG-2 compared to other RAG paradigms. Shows the iterative cycle of extracting topic entities, searching the KG, retrieving text, and updating topic entities.
Evaluation Highlights
  • Achieves SOTA performance on 6 out of 7 knowledge-intensive datasets (e.g., +15.8% accuracy on MuSiQue) using GPT-3.5
  • Elevates smaller models (Llama-2-13B) to outperform GPT-3.5's direct reasoning capabilities on complex QA tasks
  • Reduces hallucination by grounding answers in iteratively verified chains of evidence from both structured (KG) and unstructured (Text) sources
Breakthrough Assessment
8/10
Strong methodological contribution by tightly coupling KG and Text retrieval rather than just merging them. Demonstrates significant gains on complex reasoning benchmarks and offers a training-free plug-and-play solution.