
CommunityKG-RAG: Leveraging Community Structures in Knowledge Graphs for Advanced Retrieval-Augmented Generation in Fact-Checking

RC Chang, J Zhang
Department of Computer Science, University of California, Davis
arXiv, August 2024
RAG KG Factuality

📝 Paper Summary

Graph-based RAG pipeline Modularized RAG pipeline
CommunityKG-RAG enhances zero-shot fact-checking by constructing a knowledge graph from articles, detecting community structures to identify relevant subgraphs, and converting these communities into textual context for LLMs.
Core Problem
Existing RAG systems often struggle with multi-hop reasoning and context integration because they retrieve fragmented text chunks that lack structural relationships, while direct KG-based methods (feeding triples) confuse LLMs trained on sequential text.
Why it matters:
  • LLMs suffer from hallucinations and outdated training data, jeopardizing fact-checking accuracy.
  • Standard RAG struggles when crucial information is buried in long texts or when retrieved contexts contain noise/contradictions.
  • Directly feeding Knowledge Graph triples (subject, relation, object) to LLMs is suboptimal because models are not trained to leverage such structured formats effectively.
Concrete Example: When verifying a claim requiring multi-hop reasoning, a standard RAG system might retrieve disparate sentences that don't explicitly link entities. CommunityKG-RAG instead retrieves a 'community' of interconnected entities (e.g., a subgraph of related political figures and events) and converts this structural context into natural language, enabling the LLM to see the full picture.
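The retrieval step in this example can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy communities, the claim, the prompt wording, and the helper names (`bow`, `cosine`, `verbalize`, `retrieve`) are all hypothetical, and a bag-of-words cosine similarity stands in for whatever embedding model the paper uses.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words vector; an assumed stand-in for a learned sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def verbalize(triples):
    """Turn (subject, relation, object) triples into plain sentences."""
    return " ".join(f"{s} {r.replace('_', ' ')} {o}." for s, r, o in triples)


def retrieve(claim, communities, top_k=1):
    """Rank whole communities by similarity to the claim and return them as text."""
    query = bow(claim)
    ranked = sorted(
        communities.values(),
        key=lambda triples: cosine(query, bow(verbalize(triples))),
        reverse=True,
    )
    return [verbalize(triples) for triples in ranked[:top_k]]


# Hypothetical toy data: two already-detected communities of triples.
communities = {
    "energy": [
        ("Senator Lee", "sponsored", "the Clean Power Act"),
        ("the Clean Power Act", "was_signed_in", "2021"),
    ],
    "sports": [("City FC", "won", "the regional cup")],
}
claim = "Senator Lee sponsored a bill that was signed in 2021."

context = retrieve(claim, communities, top_k=1)[0]
# Assumed prompt format, for illustration only.
prompt = f"Context: {context}\nClaim: {claim}\nIs the claim supported or refuted?"
print(prompt)
```

Because the whole community is returned, both hops (who sponsored the act, and when it was signed) reach the LLM together as ordinary sentences rather than as isolated fragments or raw triples.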
Key Novelty
Community-Centric Knowledge Graph Retrieval
  • Constructs a Knowledge Graph from fact-checking articles and uses the Louvain algorithm to detect 'communities' (clusters of densely connected entities).
  • Retrieves entire communities based on semantic similarity to the claim, rather than just individual sentences or triples.
  • Converts the retrieved graph communities back into natural language sentences before feeding them to the LLM, bridging the gap between structured knowledge and sequential language processing.
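To make the community-detection step more concrete, here is a simplified sketch of the local-moving phase that Louvain uses to greedily increase modularity. It is an illustration under stated assumptions, not the paper's code: it handles only unweighted, undirected graphs without duplicate edges or self-loops, it omits Louvain's second phase (collapsing communities into super-nodes and repeating), and the function name `local_moving` and the toy entity IDs are hypothetical.

```python
from collections import Counter, defaultdict


def local_moving(edges):
    """Louvain-style local-moving phase on an unweighted, undirected graph.

    Assumes no duplicate edges and no self-loops.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    m = len(edges)                        # total number of edges
    deg = {n: len(adj[n]) for n in adj}   # node degrees
    comm = {n: n for n in adj}            # start: every node is its own community
    tot = dict(deg)                       # summed degree of each community

    improved = True
    while improved:
        improved = False
        for n in sorted(adj):
            old = comm[n]
            tot[old] -= deg[n]            # temporarily remove n from its community
            # Number of edges from n into each neighbouring community.
            links = Counter(comm[nb] for nb in adj[n])

            def gain(c):
                # Modularity gain of placing n in community c (up to a constant factor).
                return links.get(c, 0) - tot[c] * deg[n] / (2 * m)

            best = old
            for c in sorted(links):
                if gain(c) > gain(best):
                    best = c
            comm[n] = best
            tot[best] += deg[n]
            improved = improved or best != old

    groups = defaultdict(set)
    for n, c in comm.items():
        groups[c].add(n)
    return [frozenset(g) for g in groups.values()]


# Hypothetical entity IDs: two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
detected = local_moving(edges)
print(sorted(sorted(g) for g in detected))
```

On this toy graph the two densely connected triangles end up as separate communities, which is the kind of cluster the pipeline then retrieves as a unit.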
Architecture
Figure 2: Overview of the CommunityKG-RAG framework pipeline.
Evaluation Highlights
  • Outperforms the KAPING baseline by +3.45% in Accuracy on the MOCHEG dataset using Llama-2-7b.
  • Achieves 63.02% accuracy, compared with 56.09% for Semantic Retrieval and 51.13% for No Retrieval (gains of 6.93 and 11.89 percentage points).
  • Demonstrates that converting KG communities to sentences outperforms feeding raw triples to the LLM.
Breakthrough Assessment
7/10
Novel integration of community detection in KGs for RAG. Effectively addresses the structure-vs-text gap in LLMs. Strong zero-shot performance, though evaluated on a single dataset type.