
Symphony: Towards Trustworthy Question Answering and Verification using RAG over Multimodal Data Lakes

Unknown authors
Hong Kong University of Science and Technology (Guangzhou), Renmin University of China, University of Arizona, Massachusetts Institute of Technology, Google
RAG MM QA Factuality Reasoning

📝 Paper Summary

Modularized RAG pipeline Multimodal Data Lakes
Symphony is a multimodal RAG system that decomposes complex questions for reasoning and employs a loosely coupled verification module to cross-check answers against private or public data lakes.
Core Problem
LLMs often hallucinate, producing inaccurate answers, especially on complex queries over multimodal data lakes where factual correctness is critical for decision-making.
Why it matters:
  • In 2023, chatbots were estimated to hallucinate 27% of the time, with factual errors in 46% of generated texts, undermining trust in high-stakes applications.
  • Existing solutions focus on alignment or prompt engineering but lack robust, explicit verification mechanisms against reliable external data sources (such as private enterprise data lakes).
  • Complex questions often require aggregating information from multiple heterogeneous sources (tables, text, images), which standard single-step retrieval frequently fails to handle correctly.
Concrete Example: A user asks about a film's cast based on a Wikipedia table. The LLM might hallucinate that 'Meagan Good' did not appear in 'Stomp the Yard'. Symphony retrieves the specific cast table, identifies 'Meagan Good' in the 'April Palmer' role row, and refutes the LLM's claim with evidence.
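The check in this example can be illustrated with a minimal sketch. It assumes the retrieved cast table is already parsed into rows; the function name `check_absence_claim` and the column names `actor` and `role` are hypothetical, and only the 'Meagan Good' / 'April Palmer' row comes from the example above. Symphony itself uses a learned verifier rather than string matching, so this shows only the shape of the evidence lookup.

```python
# Toy cast table. Only this single row is taken from the summary's example;
# the column names are assumptions made for illustration.
cast_table = [
    {"actor": "Meagan Good", "role": "April Palmer"},
]


def check_absence_claim(actor, table):
    """Check the claim '<actor> did not appear in the film' against a cast table.

    Returns ("refuted", row) if the actor is listed, else ("not refuted", None).
    """
    for row in table:
        if row["actor"].lower() == actor.lower():
            return "refuted", row
    return "not refuted", None


verdict, evidence = check_absence_claim("Meagan Good", cast_table)
print(verdict, evidence)
```

The returned row doubles as the evidence shown to the user, which is the point of the example: the refutation is backed by a specific table entry.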
Key Novelty
Decompose-Reason-Verify Framework for Multimodal RAG
  • Separates the Reasoning process (generating an answer via question decomposition and tool use) from the Verification process (checking that answer against data lakes).
  • Uses an iterative, prompt-based decomposition strategy where an LLM breaks complex queries into sub-questions targeting specific data items (tables or text).
  • Introduces a verification module that treats the generated answer as a hypothesis, retrieving supporting/refuting evidence from potentially different (private) data lakes to validate it.
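The separation between reasoning and verification described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: every callable (`decompose`, `answer_sub`, `combine`, `retrieve_evidence`, `judge`) is a hypothetical stand-in for an LLM prompt, a retriever, or a verifier model, and the decomposition here is a single pass rather than iterative.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    answer: str
    supported: bool
    evidence: List[str]


def answer_and_verify(
    question: str,
    decompose: Callable[[str], List[str]],
    answer_sub: Callable[[str], str],
    combine: Callable[[str, List[str]], str],
    retrieve_evidence: Callable[[str], List[str]],
    judge: Callable[[str, List[str]], bool],
) -> Verdict:
    # Reasoning stage: split the question, answer each part, merge the results.
    sub_questions = decompose(question)
    sub_answers = [answer_sub(q) for q in sub_questions]
    answer = combine(question, sub_answers)

    # Verification stage: treat the answer as a hypothesis and check it
    # against evidence that may come from a different data lake.
    evidence = retrieve_evidence(answer)
    supported = judge(answer, evidence)
    return Verdict(answer, supported, evidence)


# Toy stand-ins so the sketch runs end to end.
result = answer_and_verify(
    "Who played April Palmer in Stomp the Yard?",
    decompose=lambda q: [q],
    answer_sub=lambda q: "Meagan Good",
    combine=lambda q, answers: answers[0],
    retrieve_evidence=lambda a: ["cast table row: Meagan Good | April Palmer"],
    judge=lambda a, ev: any(a in e for e in ev),
)
print(result)
```

Because the verification callables take only the answer as input, they can be swapped to point at a private data lake without touching the reasoning stage, which is the loose coupling the framework emphasizes.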
Evaluation Highlights
  • On a multimodal data lake of 400K tables and 6M passages, Symphony achieves 77.8% Recall@20 for retrieving relevant data items.
  • In a verification task using TabFact, the task-specific PASTA model achieves 89% accuracy when relevant tables are retrieved, outperforming GPT-3.5 (75%).
  • Symphony's decomposition strategy generates useful sub-queries for 77.8% of test cases (those rated 2 out of 2 by human evaluators).
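For readers unfamiliar with the retrieval metric above, here is a minimal sketch of one common reading of Recall@k: a query counts as a hit if any relevant item appears among its top k retrieved items. The summary does not state the paper's exact definition, so this reading is an assumption, and the item identifiers below are made up.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of queries with at least one relevant item in the top k results.

    ranked_ids:   one ranked list of retrieved item ids per query
    relevant_ids: one set of relevant item ids per query
    """
    hits = 0
    for ranked, relevant in zip(ranked_ids, relevant_ids):
        if set(ranked[:k]) & set(relevant):
            hits += 1
    return hits / len(ranked_ids)


# Made-up retrieval results for three queries (t = table, p = passage).
ranked = [["t1", "p3", "t7"], ["p9", "t2", "t4"], ["t5", "t6", "p1"]]
relevant = [{"p3"}, {"t4"}, {"t8"}]

print(recall_at_k(ranked, relevant, k=2))
print(recall_at_k(ranked, relevant, k=3))
```

Raising k can only keep or increase the score, which is why a Recall@20 figure should be read together with the size of the data lake being searched.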
Breakthrough Assessment
6/10
Proposes a solid architecture for trustworthy RAG with verification. While the individual components (decomposition, retrieval, verification) are known, integrating them into a unified multimodal system is valuable. The evaluation is preliminary, with small sample sizes.