โ† Back to Paper List

Hypergraphrag: Retrieval-augmented generation via hypergraph-structured knowledge representation

Haoran Luo, Haihong E, Guanting Chen, Yandan Zheng, Xiaobao Wu, Yikai Guo, Qika Lin, Yu Feng, Zemin Kuang, Meina Song, Yifan Zhu, Luu Anh Tuan
Beijing University of Posts and Telecommunications, Nanyang Technological University, Beijing Institute of Computer Technology and Application, National University of Singapore
arXiv preprint arXiv โ€ฆ (2025)
RAG KG QA

๐Ÿ“ Paper Summary

Graph-based RAG pipeline Knowledge Graph Construction
HyperGraphRAG models complex knowledge as a hypergraph where edges connect multiple entities simultaneously, enabling the retrieval of complete n-ary facts rather than fragmented binary triples found in standard graph RAG.
Core Problem
Existing graph-based RAG methods rely on ordinary graphs where edges connect only two entities (binary relations), causing information loss when modeling complex real-world facts involving three or more entities.
Why it matters:
  • Real-world knowledge (especially in medicine and law) often involves n-ary relations (e.g., Patient + Symptom + Test Result -> Diagnosis) that cannot be naturally represented by simple binary links.
  • Decomposing complex facts into binary triples leads to representation sparsity and fragmentation, forcing the retrieval system to hop multiple times to reconstruct a single coherent fact.
  • Standard chunk-based RAG misses structural connections entirely, while current GraphRAG approaches over-simplify these connections.
Concrete Example: A medical fact like 'Male hypertensive patients with serum creatinine 115โ€“133ยตmol/L are diagnosed with mild elevation' involves 4 entities. A binary graph splits this into separate triples like (Patient, Male) and (Patient, Mild elevation), losing the specific condition that links all four.
Key Novelty
Hypergraph-structured Knowledge Representation for RAG
  • Utilizes hyperedges that connect $n$ entities ($n \ge 2$) simultaneously, preserving the integrity of complex facts instead of breaking them into binary triples.
  • Stores the hypergraph in a bipartite graph database (connecting hyperedge nodes to entity nodes) to allow efficient retrieval while maintaining higher-order structural information.
  • Employes a 'hyperedge retrieval' strategy that matches queries directly to these complex n-ary facts, fused with standard text chunks for generation.
Architecture
Architecture Figure Figure 3
The HyperGraphRAG pipeline: Construction, Retrieval, and Generation.
Evaluation Highlights
  • +7.45 F1 improvement over StandardRAG across five domains (Medicine, Agriculture, CS, Legal, Mixed).
  • Outperforms GraphRAG (binary graph baseline) by +5.9 F1 score on average across domains.
  • Achieves highest Correctness score (64.8) in LLM-as-a-judge evaluation, surpassing both LightRAG (60.9) and GraphRAG (59.6).
Breakthrough Assessment
8/10
Addresses a fundamental limitation of graph RAG (binary constraint) with a mathematically sound hypergraph approach. Strong empirical gains across diverse domains validate the method's effectiveness.
×