
Open Deep Search: Democratizing Search with Open-source Reasoning Agents

Salaheddin Alzubi, Creston Brooks, Purva Chiniya, Edoardo Contente, Chiara von Gerlach, Lucas Irwin, Yihan Jiang, Arda Kaz, Windsor Nguyen, Sewoong Oh, Himanshu Tyagi, P. Viswanath
arXiv.org (2025)
RAG Agent Reasoning Benchmark

📝 Paper Summary

Agentic RAG pipeline Search AI
Open Deep Search (ODS) augments open-source LLMs with a web search tool and reasoning agents (ReAct or CodeAct) to outperform proprietary search products such as Perplexity and GPT-4o Search Preview.
Core Problem
State-of-the-art Search AI solutions (Perplexity, GPT-4o Search Preview) are closed-source, while existing open-source alternatives mostly pass raw search results to an LLM with little reasoning or processing of the retrieved content.
Why it matters:
  • Closed-source solutions limit transparency, innovation, and community development in Search AI
  • Proprietary models dominate benchmarks, creating a gap between accessible open-source tools and commercial performance
  • Simple retrieval-augmented generation often fails on complex queries requiring multi-step reasoning or precise calculations
Concrete Example: On a FRAMES benchmark question whose answer requires converting 112 inches to millimetres, Perplexity's Sonar Reasoning Pro answers 2,858 mm, which is wrong. ODS correctly identifies the 112-inch figure and uses the Wolfram Alpha tool to compute the conversion: about 2,845 mm (112 × 25.4 = 2,844.8).
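The arithmetic in this example is easy to check by hand. The sketch below is only a sanity check in plain Python, not the Wolfram Alpha call that ODS makes; the function name `inches_to_mm` is an illustrative assumption.

```python
# Sanity check of the unit conversion in the example above.
# This is NOT how ODS computes it (ODS calls Wolfram Alpha); it only
# verifies the arithmetic. `inches_to_mm` is a hypothetical helper name.

MM_PER_INCH = 25.4  # exact by definition of the international inch


def inches_to_mm(inches: float) -> float:
    """Convert a length in inches to millimetres."""
    return inches * MM_PER_INCH


converted = inches_to_mm(112)
print(round(converted, 1))      # 2844.8
print(round(converted))         # 2845
print(2858 - round(converted))  # 13 (mm by which the incorrect answer is off)
```

Rounded to the nearest millimetre, the result agrees with the 2,845 mm answer attributed to ODS.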
Key Novelty
Open Deep Search (ODS) Framework
  • Combines an 'Open Search Tool' (which rephrases queries, scrapes, and reranks content) with 'Open Reasoning Agents' (ReAct or CodeAct) that orchestrate tool usage
  • Integrates Chain-of-Thought Self-Consistency and dynamic few-shot prompting to enhance the reasoning reliability of open-source base models like DeepSeek-R1
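The two ideas above can be sketched in a few lines. This is a toy illustration under loud assumptions, not the ODS implementation: every function name is hypothetical, the "search" runs over an in-memory list instead of scraped web pages, reranking uses simple token overlap in place of a real reranker, and the LLM is replaced by any callable that returns an answer string.

```python
from collections import Counter
from typing import Callable, List

# Toy sketch only: all names are hypothetical and nothing here is ODS's API.


def rephrase(query: str) -> List[str]:
    """Stand-in for query rephrasing: the original query plus a keyword-only variant."""
    stopwords = {"what", "is", "the", "of", "a", "an", "in", "to", "how"}
    keywords = [w for w in query.lower().replace("?", "").split() if w not in stopwords]
    return [query, " ".join(keywords)]


def overlap_score(query: str, passage: str) -> int:
    """Toy relevance score: number of distinct query tokens found in the passage."""
    q_tokens = set(query.lower().replace("?", "").split())
    p_tokens = set(passage.lower().replace(".", "").split())
    return len(q_tokens & p_tokens)


def search_and_rerank(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Score each passage against every query variant and keep the top_k passages."""
    variants = rephrase(query)
    scored = [(max(overlap_score(v, p) for v in variants), p) for p in corpus]
    # sort() is stable, so passages with equal scores keep their corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


def self_consistent_answer(sample: Callable[[], str], n: int = 5) -> str:
    """Self-consistency: draw n answers and return the most frequent one."""
    votes = Counter(sample().strip().lower() for _ in range(n))
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    corpus = [
        "One inch equals 25.4 millimetres.",
        "Paris is the capital of France.",
        "The capital of Italy is Rome.",
    ]
    print(search_and_rerank("What is the capital of France?", corpus, top_k=1))

    # Fixed list standing in for five sampled chain-of-thought completions.
    canned = iter(["Paris", "paris ", "Lyon", "Paris", "Lyon"])
    print(self_consistent_answer(lambda: next(canned), n=5))
```

In a real system the reasoning agent (ReAct or CodeAct) would decide when to call the search tool and what to do with the returned passages; here the two pieces are shown separately to keep the sketch short.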
Evaluation Highlights
  • +9.7% accuracy improvement on the FRAMES benchmark using ODS-v2+DeepSeek-R1 compared to GPT-4o Search Preview
  • 88.3% accuracy on SimpleQA with ODS-v2+DeepSeek-R1, surpassing Perplexity Sonar Reasoning Pro (82.2%)
  • ODS-v1+DeepSeek-R1 achieves 69.8% on FRAMES, outperforming Perplexity Sonar Reasoning Pro (64.5%)
Breakthrough Assessment
8/10
Demonstrates that open-source agents built on open models can surpass leading proprietary search products (GPT-4o Search Preview, Perplexity) on difficult benchmarks, making high-end search AI capabilities broadly accessible.