
Connecting the Knowledge Dots: Retrieval-augmented Knowledge Connection for Commonsense Reasoning

J Kim, S Bak, M Lee, M Hong, S Kim, TE Kam, SK Lee
Korea University
Proceedings of the 2025 …, 2025 (2025)
RAG Reasoning QA

📝 Paper Summary

Modularized RAG pipeline, Commonsense Reasoning
RECONNECT transforms indirectly relevant retrieved documents into direct, question-specific explanations by extracting knowledge from diverse document subsets and aggregating it before inference.
Core Problem
Commonsense reasoning requires implicit knowledge rarely stated explicitly in text, causing standard retrieval to return documents that are only indirectly relevant and lack the direct information needed to answer the question.
Why it matters:
  • LLMs struggle with commonsense reasoning because the necessary knowledge is implicit and not directly represented in the question text
  • Existing Retrieval-Augmented Language Models (RALMs) often retrieve documents that do not directly contain the answer, leading to a gap between retrieved context and useful reasoning
  • Finite knowledge bases may not cover the specific direct information required, limiting generalizability on out-of-domain tasks
Concrete Example: Question: 'What happens when a crumpled and a flat sheet of paper are dropped?' Standard retrieval finds only general facts about 'crumpling paper' or 'dropping items'. RECONNECT synthesizes disparate facts (one document mentions that air resistance depends on shape, another says crumpled paper has less air resistance) into a direct explanation: 'The crumpled paper has less air resistance, so its greater net force makes it fall faster.'
Key Novelty
Retrieval-augmented Knowledge Connection (RECONNECT)
  • Explanation-guided retrieval: Expands the original query into a detailed explanation so that retrieval returns contextually aligned documents rather than relying on keyword matching alone
  • Relevance-based document sampling: Stochastically selects document subsets that balance relevance to the question and diversity among documents to capture multiple perspectives
  • Knowledge connection: Extracts relevant knowledge from each of these diverse subsets and aggregates it into a single, coherent, direct explanation used for final inference
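The relevance-based sampling step can be illustrated with a minimal sketch. This summary does not give the paper's actual scoring rule, so everything below is an assumption made for illustration: the function names `cosine`, `sample_subset`, and `sample_subsets`, the `trade_off` and `temperature` parameters, and the use of cosine similarity over toy embedding vectors. The sketch draws each document stochastically, with weights that reward similarity to the question and penalize similarity to documents already in the subset.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sample_subset(query_vec, doc_vecs, k, trade_off=0.7, temperature=0.1, rng=None):
    """Stochastically pick up to k document indices (hypothetical scoring rule).

    Score = trade_off * relevance - (1 - trade_off) * redundancy, where
    redundancy is the highest similarity to any document already chosen.
    """
    rng = rng or random.Random()
    remaining = list(range(len(doc_vecs)))
    chosen = []
    while remaining and len(chosen) < k:
        scores = []
        for i in remaining:
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in chosen), default=0.0
            )
            scores.append(trade_off * relevance - (1 - trade_off) * redundancy)
        # Softmax-style weights; subtracting the max keeps exp() numerically stable.
        top = max(scores)
        weights = [math.exp((s - top) / temperature) for s in scores]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def sample_subsets(query_vec, doc_vecs, n_subsets, k, seed=0):
    """Draw several subsets so that later stages see multiple perspectives."""
    rng = random.Random(seed)
    return [sample_subset(query_vec, doc_vecs, k, rng=rng) for _ in range(n_subsets)]
```

A lower `temperature` makes the draw closer to a greedy selection, while a higher one spreads probability over less relevant documents; the right setting for a given retriever would have to be tuned.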
Architecture
Figure 2
The RECONNECT pipeline: Query Expansion → Retrieval → Relevance-based Subset Sampling → Knowledge Extraction from subsets → Aggregation into a direct explanation → Answer Prediction.
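The stage order above can be sketched as a small orchestration function. This is not the authors' implementation: the function name `reconnect_pipeline`, its parameter names, and the callable signatures are hypothetical, and each stage is passed in as a plain callable so that any LLM or retriever could be plugged in. The toy stubs at the bottom exist only to show the data flow.

```python
from typing import Callable, List, Sequence


def reconnect_pipeline(
    question: str,
    expand_query: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    sample_document_subsets: Callable[[str, List[str]], List[List[str]]],
    extract_knowledge: Callable[[str, Sequence[str]], str],
    aggregate: Callable[[str, List[str]], str],
    answer: Callable[[str, str], str],
) -> str:
    """Run the six stages in order (hypothetical interfaces, for illustration)."""
    explanation_query = expand_query(question)              # Query Expansion
    documents = retrieve(explanation_query)                 # Retrieval
    subsets = sample_document_subsets(question, documents)  # Subset Sampling
    extracted = [extract_knowledge(question, s) for s in subsets]  # Extraction
    direct_explanation = aggregate(question, extracted)     # Aggregation
    return answer(question, direct_explanation)             # Answer Prediction


# Toy stubs standing in for the retriever and the LLM calls.
toy_docs = [
    "Air resistance depends on shape.",
    "Crumpled paper has less air resistance.",
]
result = reconnect_pipeline(
    "Which falls faster, crumpled or flat paper?",
    expand_query=lambda q: q + " (expanded)",
    retrieve=lambda q: toy_docs,
    sample_document_subsets=lambda q, d: [d[:1], d[1:]],
    extract_knowledge=lambda q, subset: " ".join(subset),
    aggregate=lambda q, pieces: " ".join(pieces),
    answer=lambda q, explanation: f"Answer based on: {explanation}",
)
print(result)
```

Keeping each stage behind a callable mirrors the modular framing of the pipeline: a stage can be swapped or ablated without touching the others.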
Evaluation Highlights
  • +2.0% average accuracy improvement over SOTA baseline (ZEBRA) on 8 in-domain commonsense benchmarks using Llama 3.1-8B Instruct
  • +4.6% average accuracy improvement over SOTA baseline (ZEBRA) on 8 out-of-domain benchmarks, demonstrating strong generalization
  • Outperforms supervised knowledge generation methods (like COCONUT) without requiring additional fine-tuning of a specific generation model
Breakthrough Assessment
8/10
Significant gains on both in-domain (ID) and out-of-domain (OOD) tasks by addressing the specific semantic gap in commonsense retrieval. The shift from simple retrieval to 'retrieval-then-synthesis' is a strong methodological contribution.