
Diversify-verify-adapt: Efficient and Robust Retrieval-Augmented Ambiguous Question Answering

Y In, S Kim, RA Rossi, MM Tanjim, T Yu, R Sinha…
Korea Advanced Institute of Science and Technology, Adobe Research
arXiv, 9/2024
RAG QA Reasoning

📝 Paper Summary

Modularized RAG pipeline Ambiguous Question Answering
Diva improves ambiguous question answering by diversifying retrieval via pseudo-interpretations and adaptively choosing between RAG and closed-book generation based on retrieval quality verification.
Core Problem
Single-step RAG often fails to retrieve passages covering all interpretations of an ambiguous question (low recall), while Iterative RAG (like ToC) is computationally expensive and slow.
Why it matters:
  • Over 50% of search queries are ambiguous, requiring systems to cover multiple valid user intents rather than a single answer
  • Current iterative methods require nearly 5.5 exploration steps per query, drastically increasing latency and API costs
  • RAG systems suffer significant factual accuracy degradation when retrieved passages contain noise or irrelevant information
Concrete Example: For the question 'Who played the Weasley brothers in Harry Potter?', a standard retriever might find only passages about Ron Weasley and miss other brothers such as Percy. Iterative approaches eventually find them, but only after several costly exploration steps. Diva instead infers pseudo-interpretations such as 'Who played Ron?' and 'Who played Percy?' upfront and retrieves passages for all of them in a single step.
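The retrieval side of this example can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the corpus, the word-overlap `retrieve` function, and the hard-coded `pseudo_interpretations` are all assumptions made for the sketch (in Diva the interpretations are inferred by an LLM and a real retriever is used).

```python
import re

# Toy corpus with placeholder passages (hypothetical, not real data).
CORPUS = {
    "p1": "Actor A played Ron Weasley in the Harry Potter films.",
    "p2": "Actor B played Percy Weasley in the Harry Potter films.",
    "p3": "Harry Potter is a series of fantasy novels.",
}

def tokens(text):
    """Lowercased set of alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, k=1):
    """Rank passages by word overlap with the query (stand-in for a real retriever).
    Ties are broken by passage id so the result is deterministic."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda pid: (-len(q & tokens(CORPUS[pid])), pid))
    return ranked[:k]

def diversified_retrieve(interpretations, k=1):
    """Retrieve once per pseudo-interpretation and merge results without duplicates."""
    merged = []
    for interpretation in interpretations:
        for pid in retrieve(interpretation, k):
            if pid not in merged:
                merged.append(pid)
    return merged

question = "Who played the Weasley brothers in Harry Potter?"

# Hard-coded here; assumed to come from an LLM in the actual method.
pseudo_interpretations = [
    "Who played Ron Weasley in Harry Potter?",
    "Who played Percy Weasley in Harry Potter?",
]

print(retrieve(question, k=1))                        # only one brother is covered
print(diversified_retrieve(pseudo_interpretations))   # both brothers are covered
```

With the same per-query budget (`k=1`), the ambiguous question surfaces a single passage, while the two pseudo-interpretations together surface one passage per brother. All per-interpretation lookups are independent, so they need no iterative exploration.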
Key Novelty
Diversify-Verify-Adapt (Diva)
  • **Retrieval Diversification (RD):** mimics human reasoning to infer 'pseudo-interpretations' of an ambiguous question upfront, using them to retrieve a diverse set of passages in a single step rather than iteratively.
  • **Retrieval Verification (RV):** defines a new quality criterion (Useful, PartialUseful, Useless) for ambiguous QA and uses an LLM to grade whether retrieved passages cover the inferred interpretations.
  • **Adaptive Generation (AG):** dynamically selects the generation strategy: RAG when the retrieved passages are graded Useful or PartialUseful, or a fallback to the LLM's internal knowledge (closed-book) when retrieval is graded Useless.
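The verification and adaptation steps can be read as a small routing function. The sketch below is a minimal illustration under stated assumptions: the `covers` callback stands in for the LLM grader, the mapping from coverage to grades (all, some, or none of the interpretations covered) is an assumed reading of the three labels, and `Grade`, `verify`, `adapt`, and `substring_covers` are hypothetical names.

```python
from enum import Enum

class Grade(Enum):
    USEFUL = "Useful"
    PARTIAL_USEFUL = "PartialUseful"
    USELESS = "Useless"

def verify(interpretations, passages, covers):
    """Grade retrieval quality by how many interpretations are covered.
    `covers(interpretation, passage)` is a stand-in for the LLM judgment."""
    covered = sum(
        any(covers(interp, passage) for passage in passages)
        for interp in interpretations
    )
    if interpretations and covered == len(interpretations):
        return Grade.USEFUL
    if covered > 0:
        return Grade.PARTIAL_USEFUL
    return Grade.USELESS

def adapt(grade):
    """Pick the generation strategy: closed-book only when retrieval is Useless."""
    return "closed_book" if grade is Grade.USELESS else "rag"

def substring_covers(interpretation, passage):
    """Toy coverage check (assumption): the interpretation's target entity
    appears verbatim in the passage."""
    return interpretation.lower() in passage.lower()

# Interpretations are reduced to their target entities for this toy check.
interpretations = ["Ron Weasley", "Percy Weasley"]
passages = ["Actor A played Ron Weasley in the films."]

grade = verify(interpretations, passages, substring_covers)
print(grade.value, "->", adapt(grade))
```

Keeping `verify` and `adapt` separate mirrors the RV/AG split: the grader can be swapped for an LLM call without touching the routing rule.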
Architecture
Figure 2: Comparison of Vanilla RAG, Iterative RAG (ToC), and the proposed Diva framework architectures.
Evaluation Highlights
  • Outperforms the state-of-the-art Iterative RAG method (ToC) by 1.9 D-F1 points on ASQA (an ambiguous QA benchmark) using GPT-3.5.
  • Achieves ~3x faster inference speed compared to Iterative RAG (ToC) while maintaining superior accuracy.
  • Reduces input token consumption by >50% compared to Iterative RAG methods.
Breakthrough Assessment
7/10
Strong practical contribution addressing the latency/cost bottleneck of Iterative RAG while improving accuracy. The 'verify and adapt' mechanism effectively handles retrieval failure, a common RAG pain point.