
Way to Specialist: Closing Loop Between Specialized LLM and Evolving Domain Knowledge Graph

Y Zhang, L Chen, S Li, N Cao, Y Shi, J Ding, Z Qu…
Shanghai Jiao Tong University, Tongji University, Central South University, Huazhong University of Science and Technology
arXiv, 11/2024
RAG KG QA

📝 Paper Summary

Graph-based RAG pipeline
WTS creates a closed loop where a domain knowledge graph improves LLM reasoning via RAG, while the LLM simultaneously updates and expands the graph using knowledge extracted from successfully answered questions.
Core Problem
Generalist LLMs lack specialized domain knowledge, but existing RAG solutions rely on static, often incomplete knowledge graphs or coarse general graphs (like Wikidata) that fail to support deep domain reasoning.
Why it matters:
  • Specialized domains (medical, legal) require high-precision knowledge that general models lack, but fine-tuning is expensive and data-hungry
  • Static knowledge graphs become outdated quickly and cannot adapt to new questions or evolving domain information
  • Current approaches use unidirectional 'KG-for-LLM' enhancement, missing the opportunity to use the LLM's own reasoning to improve the underlying knowledge base
Concrete Example: In a medical query about the 'auriculotemporal nerve', a standard RAG system may fail if the specific relation 'encircles middle meningeal artery' is missing from the graph. WTS not only answers such questions using the available data but also uses the answer to generate the triple {auriculotemporal nerve, encircle, middle meningeal artery}, which it adds to the graph for future use.
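To make the triple in this example concrete, here is a minimal sketch of how such a fact could be stored and looked up. The `DomainKG` class and its methods are hypothetical names chosen for illustration; the summary does not say how WTS actually stores its graph.

```python
from collections import defaultdict


class DomainKG:
    """Toy domain knowledge graph holding (head, relation, tail) triples.

    Hypothetical illustration only; not the storage layer used by WTS.
    """

    def __init__(self):
        # Maps each head entity to the set of its outgoing (relation, tail) edges.
        self._out = defaultdict(set)

    def add(self, head, relation, tail):
        """Insert a triple and return True if it was not already present."""
        edge = (relation, tail)
        if edge in self._out[head]:
            return False
        self._out[head].add(edge)
        return True

    def neighbors(self, head):
        """Return the outgoing edges of `head` as a sorted list."""
        return sorted(self._out.get(head, ()))


kg = DomainKG()
first = kg.add("auriculotemporal nerve", "encircle", "middle meningeal artery")
second = kg.add("auriculotemporal nerve", "encircle", "middle meningeal artery")
```

Adding the same triple twice leaves the graph unchanged, so a graph that grows from repeated questions does not accumulate duplicates.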
Key Novelty
Way-to-Specialist (WTS) bidirectional 'LLM ⟳ KG' framework
  • Implements a 'DKG-Augmented LLM' that uses iterative retrieval and pruning over a domain knowledge graph (DKG) to prompt the LLM for answers
  • Implements 'LLM-Assisted DKG Evolution' where the LLM extracts new knowledge triples from answered questions to update the DKG, allowing the system to start with an empty graph and learn from experience
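The two components above can be sketched as one loop step: retrieve and prune evidence, ask the LLM, then write extracted triples back when the answer is correct. This is a rough sketch under stated assumptions, not the authors' implementation: `overlap_score`, `retrieve`, `step`, and the stub LLM callables are hypothetical, the token-overlap pruning is a stand-in heuristic, and the gold-answer check mirrors the summary's note that evolution relies on gold answers.

```python
from typing import Callable, Iterable

Triple = tuple[str, str, str]


def overlap_score(question: str, triple: Triple) -> int:
    """Stand-in relevance score (assumption): tokens shared with the question."""
    q_tokens = set(question.lower().split())
    t_tokens = set(" ".join(triple).lower().split())
    return len(q_tokens & t_tokens)


def retrieve(kg: set[Triple], question: str, seeds: Iterable[str],
             hops: int = 2, keep: int = 3) -> list[Triple]:
    """Expand from the seed entities hop by hop, keeping the top `keep` triples per hop."""
    frontier, kept = set(seeds), []
    for _ in range(hops):
        candidates = [t for t in kg if t[0] in frontier and t not in kept]
        # Highest score first; ties broken by the triple itself for determinism.
        candidates.sort(key=lambda t: (-overlap_score(question, t), t))
        pruned = candidates[:keep]
        if not pruned:
            break
        kept.extend(pruned)
        frontier = {t[2] for t in pruned}  # continue from the tail entities
    return kept


def step(kg: set[Triple], question: str, seeds: Iterable[str], gold: str,
         llm_answer: Callable[[str, list[Triple]], str],
         llm_extract: Callable[[str, str], list[Triple]]) -> str:
    """One loop step: DKG-augmented answering, then gated DKG evolution."""
    evidence = retrieve(kg, question, seeds)
    answer = llm_answer(question, evidence)
    if answer == gold:  # only successfully answered questions update the graph
        kg.update(llm_extract(question, answer))
    return answer


# Stub LLM calls (assumptions) so the sketch runs without a model.
stub_answer = lambda q, evidence: "middle meningeal artery"
stub_extract = lambda q, a: [("auriculotemporal nerve", "encircle", a)]

kg: set[Triple] = set()  # the graph may start empty
question = "Which artery does the auriculotemporal nerve encircle?"
seeds = ["auriculotemporal nerve"]
step(kg, question, seeds, "middle meningeal artery", stub_answer, stub_extract)
evidence_next_time = retrieve(kg, question, seeds)
```

After the first step the graph holds the extracted triple, so the next retrieval for a related question returns it as evidence. In a real system the two stubs would be replaced by prompts to the LLM.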
Architecture
Figure 2
The complete WTS framework comprising two loops: DKG-Augmented LLM (Retrieval) and LLM-Assisted DKG Evolution (Update).
Evaluation Highlights
  • +11.3% accuracy improvement over the SOTA baseline, Think-on-Graph (ToG), on specialized domain datasets
  • +126.9% accuracy gain on PubMedQA using GPT-4o compared to standard I/O prompting without RAG
  • Achieves superior performance in 4 out of 5 specialized domains (medical, natural science, social science, linguistics) compared to baselines like Chain-of-Thought and Think-on-Graph
Breakthrough Assessment
7/10
Strong conceptual novelty in closing the loop between RAG and KG construction without training. Demonstrates significant gains in specialized domains, though reliance on 'gold answers' for the apprenticeship phase limits fully autonomous deployment.