
TRACE the Evidence: Constructing Knowledge-Grounded Reasoning Chains for Retrieval-Augmented Generation

Jinyuan Fang, Zaiqiao Meng, Craig Macdonald
University of Glasgow
arXiv, June 2024
RAG KG Reasoning QA

📝 Paper Summary

Graph-based RAG pipeline, Modularized RAG pipeline
TRACE improves multi-hop RAG by converting retrieved documents into a knowledge graph and autoregressively constructing reasoning chains of triples to identify supporting evidence.
Core Problem
Retrievers in RAG often return irrelevant documents, and this noise degrades performance on multi-hop questions, which require multi-step reasoning.
Why it matters:
  • Irrelevant documents in the retrieved set can mislead the reader model, causing hallucinations or incorrect answers
  • Multi-hop questions require connecting dispersed pieces of evidence, which standard RAG struggles to do when evidence is buried in noisy documents
  • Simply prepending all retrieved documents to the prompt often results in suboptimal performance due to the 'lost-in-the-middle' phenomenon
Concrete Example: For the question 'When was the father of Albert Einstein born?', standard RAG might retrieve documents about Albert Einstein's physics theories. TRACE extracts the triple (Albert Einstein, father, Hermann Einstein) and then links it to (Hermann Einstein, date of birth, 30 August 1847) to answer correctly.
Key Novelty
Knowledge-Grounded Reasoning Chains (TRACE)
  • Converts unstructured documents into a structured Knowledge Graph (KG) of triples to granularly separate relevant facts from noise
  • Constructs reasoning chains autoregressively: selects a triple from the KG, then selects the next triple based on the question and previous triples, mimicking human step-by-step reasoning
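The two steps above can be sketched as a small loop over a triple store. This is a minimal illustration, not the paper's implementation: the entities in the toy KG are invented, `toy_score` is a hypothetical word-overlap heuristic standing in for whatever learned or LLM-based selector TRACE actually uses, and `max_len` and `min_score` are illustrative parameters. The loop stops early when no remaining triple scores high enough, which is one simple way to get a chain of adaptive length.

```python
import re
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

STOPWORDS = {"the", "of", "a", "an", "was", "is", "in",
             "where", "when", "who", "what"}


def tokens(text: str) -> set:
    """Lowercased content words of a string."""
    return {w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS}


def toy_score(question: str, chain: List[Triple], candidate: Triple) -> float:
    """Hypothetical stand-in for a learned triple selector.

    First hop: word overlap between the question and the triple's head/relation.
    Later hops: the triple must start at the previous triple's tail entity.
    """
    head, relation, _tail = candidate
    q = tokens(question)
    if chain:
        if head != chain[-1][2]:
            return 0.0
        return 1.0 + len(tokens(relation) & q)
    return float(len(tokens(head + " " + relation) & q))


def build_chain(question: str,
                kg: List[Triple],
                score: Callable[[str, List[Triple], Triple], float] = toy_score,
                max_len: int = 4,
                min_score: float = 1.0) -> List[Triple]:
    """Pick one triple at a time, conditioning on the question and the chain so far."""
    chain: List[Triple] = []
    remaining = list(kg)
    while remaining and len(chain) < max_len:
        best = max(remaining, key=lambda t: score(question, chain, t))
        if score(question, chain, best) < min_score:
            break  # nothing useful left: end the chain early
        chain.append(best)
        remaining.remove(best)
    return chain


# Invented toy KG; in TRACE the triples would come from the retrieved documents.
kg = [
    ("Film X", "director", "Jane Doe"),
    ("Film X", "release year", "1999"),
    ("John Roe", "place of birth", "Shelbyville"),
    ("Jane Doe", "place of birth", "Springfield"),
]
question = "Where was the director of Film X born?"
chain = build_chain(question, kg)
print(chain)
# [('Film X', 'director', 'Jane Doe'), ('Jane Doe', 'place of birth', 'Springfield')]
```

Only the two triples that connect the question to the answer end up in the chain; the release-year fact and the triple about an unrelated person are left out, which is the noise-filtering effect the summary describes.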
Architecture
Figure 2: Overview of the TRACE framework, illustrating the pipeline from documents to answer.
Evaluation Highlights
  • Up to +14.03% average improvement in Exact Match (EM) over standard RAG (using all retrieved documents) across three multi-hop QA datasets
  • Using only the reasoning chains (KG triples) as context is often sufficient, outperforming the use of full documents by reducing noise
  • Adaptive chain termination strategy significantly improves performance compared to fixed-length chains
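For readers unfamiliar with the metric, Exact Match is typically computed by normalizing the predicted and gold answers and checking for string equality. The sketch below follows the widely used SQuAD-style normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). The summary does not state which normalization the paper uses, so treat this as an assumption, and the function names are illustrative.

```python
import re
import string
from typing import List


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: List[str]) -> float:
    """1.0 if the normalized prediction equals any normalized gold answer, else 0.0."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(g) for g in gold_answers))


def average_em(predictions: List[str], references: List[List[str]]) -> float:
    """Mean EM over a dataset, reported as a percentage."""
    scores = [exact_match(p, refs) for p, refs in zip(predictions, references)]
    return 100.0 * sum(scores) / len(scores)
```

Because EM is all-or-nothing per question, a reader that is distracted by a noisy document into a slightly different answer scores zero on that question, which is why filtering the context can move this metric noticeably.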
Breakthrough Assessment
7/10
Strong empirical gains on multi-hop QA by integrating KG construction into RAG. The approach effectively addresses noise in retrieval, though reliance on LLMs for KG generation might be computationally heavy.