From "What to Eat?" to Perfect Recipe: ChefMind's Chain-of-Exploration for Ambiguous User Intent in Recipe Recommendation

📝 Paper Summary

Modularized RAG pipeline | Personalized Recommendation

ChefMind is a hybrid recipe recommendation system that uses a Chain of Exploration to refine ambiguous queries into structured conditions, combining Knowledge Graphs for semantic accuracy and RAG for contextual details.

Core Problem

Personalized recipe recommendation struggles with fuzzy user intent (e.g., 'healthy comfort food'). Used in isolation, LLMs, Knowledge Graphs, and RAG each fall short on semantic accuracy or on detail coverage.

Why it matters:

LLMs alone suffer from hallucinations, inventing non-existent recipes or unsafe cooking instructions
Knowledge Graphs lack adaptability to dynamic, unstructured user queries despite their semantic precision
RAG relies heavily on retrieval quality and often fails to capture structured constraints like dietary restrictions

Concrete Example: A user asks for 'something healthy and home-style'. A keyword search might fail to find matches. An LLM might hallucinate a dish. ChefMind's CoE module refines this into structured conditions (Health=True, Tag=Home-style) for the KG, while RAG fetches specific cooking tips.

Key Novelty

Chain of Exploration (CoE) + Hybrid Retrieval (KG + RAG)

Introduces a 'Chain of Exploration' (CoE) module acting as an intelligent frontend that progressively refines ambiguous queries into structured database constraints
Integrates structured semantic reasoning (Knowledge Graph) with unstructured context retrieval (RAG) in a unified loop, where the KG handles hard constraints and RAG provides rich details

Architecture

The overall framework of ChefMind, illustrating the flow from User to CoE, then branching to KG and RAG, and converging at the LLM

Evaluation Highlights

Achieves an average score of 8.7/10 across accuracy, relevance, completeness, and clarity, outperforming LLM+RAG (6.7) and LLM+KG (6.4)
Reduces unprocessed queries to 1.6% (2 queries), far below LLM+KG (25.6%) and LLM+RAG (17.1%)
Demonstrates superior robustness in handling fuzzy demands, with only 1 unprocessed query in challenging batches where baselines failed on 4-5 queries

Breakthrough Assessment

7/10

Solid engineering integration of CoE, KG, and RAG for a specific domain. While the components are known, the specific hybrid architecture effectively solves the 'fuzzy intent' problem in recommendation, showing strong empirical gains.

⚙️ Technical Details

Problem Definition

Setting: Recipe recommendation given user queries (explicit or ambiguous)

Inputs: Natural language user query Q

Outputs: Natural language recommendation R_final containing recipe names, reasons, and cooking details

Pipeline Flow

Input Processing: Chain of Exploration (CoE) parses query
Retrieval & Selection: Knowledge Graph (KG) retrieval + RAG vector retrieval
Generation: LLM Integration
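
The three stages above can be sketched as a small orchestration function. This is a minimal illustration, not the authors' implementation: the function name `recommend`, the stage signatures, and the stub stages are all assumptions, and the sketch assumes the KG and RAG branches both start from the CoE stage and meet again at the LLM, as the architecture figure describes.

```python
from typing import Callable, Dict, List


def recommend(
    query: str,
    parse: Callable[[str], Dict[str, object]],
    kg_search: Callable[[Dict[str, object]], List[str]],
    rag_search: Callable[[str], List[str]],
    generate: Callable[[str, List[str], List[str]], str],
) -> str:
    """Run the stages in order: CoE parsing, KG + RAG retrieval, LLM generation."""
    conditions = parse(query)            # Input Processing (CoE)
    candidates = kg_search(conditions)   # Retrieval & Selection (KG branch)
    details = rag_search(query)          # Retrieval & Selection (RAG branch)
    return generate(query, candidates, details)  # Generation (LLM)


# Stub stages, for illustration only; real stages would call Neo4j, Milvus, and an LLM.
answer = recommend(
    "something healthy and home-style",
    parse=lambda q: {"health": True, "tag": "home-style"},
    kg_search=lambda cond: ["tomato egg stir-fry"],
    rag_search=lambda q: ["use less oil for a lighter result"],
    generate=lambda q, names, tips: f"Try {names[0]} ({tips[0]})",
)
print(answer)
```

Passing the stages in as callables keeps the sketch independent of any particular database or model client.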

System Modules

Chain of Exploration (CoE) (Input Processing)

Parse fuzzy demands into structured conditions; acts as entry point

Model or implementation: Rule-based progressive search logic
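
A rule-based refinement step might look like the sketch below. The keyword rules, field names, and the `refine` function are hypothetical; the summary does not list the paper's actual rules, so this only shows the general idea of turning fuzzy wording into structured conditions.

```python
from typing import Dict

# Hypothetical keyword rules; the actual CoE rules are not given in the summary.
RULES = {
    "healthy": ("health", True),
    "home-style": ("tag", "home-style"),
}


def refine(query: str) -> Dict[str, object]:
    """Collect a structured condition for every rule keyword found in the query."""
    text = query.lower()
    conditions: Dict[str, object] = {}
    for keyword, (field, value) in RULES.items():
        if keyword in text:
            conditions[field] = value
    return conditions


print(refine("Something healthy and home-style"))
```

An empty result would signal that no rule matched, which a fuller system could treat as a cue for further refinement.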

Knowledge Graph (KG) (Retrieval & Selection)

Retrieve candidate recipes based on structured semantic constraints

Model or implementation: Neo4j Graph Database
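
To make the structured lookup concrete, the sketch below turns a set of conditions into a parameterized Cypher query string. The `Recipe` label, the `health` and `tag` properties, and the `build_cypher` helper are assumptions for illustration; the paper's graph schema is not described in this summary.

```python
from typing import Dict, Tuple

# Hypothetical properties on a hypothetical Recipe node.
ALLOWED_FIELDS = {"health", "tag"}


def build_cypher(conditions: Dict[str, object], limit: int = 10) -> Tuple[str, Dict[str, object]]:
    """Build a parameterized Cypher query; unknown fields are ignored."""
    fields = sorted(f for f in conditions if f in ALLOWED_FIELDS)
    where = " AND ".join(f"r.{f} = ${f}" for f in fields)
    query = "MATCH (r:Recipe)"
    if where:
        query += f" WHERE {where}"
    query += " RETURN r.name AS name LIMIT $limit"
    params: Dict[str, object] = {f: conditions[f] for f in fields}
    params["limit"] = limit
    return query, params


query, params = build_cypher({"health": True, "tag": "home-style"})
print(query)
print(params)
```

Values travel as query parameters rather than being spliced into the string, and only whitelisted property names reach the query text.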

RAG Module (Retrieval & Selection)

Retrieve unstructured details (steps, tips) via vector similarity

Model or implementation: Milvus Vector Database (768-dim embeddings)
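
Vector retrieval of this kind can be illustrated without a database. The sketch below ranks text chunks by cosine similarity to a query embedding. Cosine similarity is an assumption (the summary does not name the metric), the 3-dimensional vectors stand in for the 768-dimensional embeddings, and the chunk texts are invented.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], chunks: List[Tuple[str, Sequence[float]]], k: int = 2) -> List[str]:
    """Return the k chunk texts most similar to the query vector."""
    ranked = sorted(chunks, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy chunks with toy embeddings, for illustration only.
chunks = [
    ("Blanch the greens briefly to keep them crisp.", [0.9, 0.1, 0.0]),
    ("Rest the dough for thirty minutes.", [0.0, 0.2, 0.9]),
    ("Use less oil for a lighter stir-fry.", [0.8, 0.3, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], chunks, k=2))
```

A vector database such as Milvus performs the same ranking with an index instead of a full scan.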

LLM Integrator (Generation)

Integrate structured KG results and RAG details into natural language response

Model or implementation: DeepSeek
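
One plausible way to hand both result sets to the LLM is a single prompt that lists KG candidates and RAG details separately. The prompt wording and the `build_prompt` helper below are assumptions; the summary does not give the paper's prompt, and no model call is made here.

```python
from typing import List


def build_prompt(query: str, candidates: List[str], details: List[str]) -> str:
    """Assemble a prompt that keeps KG candidates and RAG details in separate sections."""
    lines = [
        "You are a recipe assistant. Recommend only from the candidates below.",
        f"User request: {query}",
        "Candidate recipes (from the knowledge graph):",
    ]
    lines += [f"- {name}" for name in candidates]
    lines.append("Supporting details (from retrieval):")
    lines += [f"- {detail}" for detail in details]
    lines.append("For each recommendation, give the recipe name, a reason, and cooking details.")
    return "\n".join(lines)


prompt = build_prompt(
    "something healthy and home-style",
    ["tomato egg stir-fry"],
    ["Use less oil for a lighter stir-fry."],
)
print(prompt)
```

The final instruction mirrors the stated output format: recipe names, reasons, and cooking details.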

Novel Architectural Elements

Conditional workflow switching: CoE dynamically routes fuzzy vs. clear demands
Hybrid retrieval binding: KG provides the 'skeleton' (candidates) while RAG provides the 'flesh' (details), explicitly combined by the LLM
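
The routing decision can be sketched as a small classifier over the query text. The abstract-term list and route names are hypothetical. The length threshold follows the summary's "<5 chars" criterion for fuzzy demands, which in the original setting presumably counts Chinese characters, so applying it to English text here is only illustrative.

```python
# Hypothetical list of abstract terms that signal a fuzzy demand.
ABSTRACT_TERMS = {"healthy", "comfort", "home-style", "something"}


def is_fuzzy(query: str) -> bool:
    """Treat very short queries or queries containing abstract terms as fuzzy."""
    text = query.strip().lower()
    if len(text) < 5:
        return True
    return any(term in text for term in ABSTRACT_TERMS)


def route(query: str) -> str:
    """Pick a workflow name; both names are placeholders."""
    return "refine_with_coe" if is_fuzzy(query) else "direct_lookup"


for q in ["soup", "kung pao chicken", "healthy comfort food"]:
    print(q, "->", route(q))
```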

Modeling

Base Model: DeepSeek

Compute: Not reported in the paper

Comparison to Prior Work

vs. LLM+KG: ChefMind adds RAG for unstructured details and CoE for query refinement
vs. LLM+RAG: ChefMind adds KG for structured constraint handling and CoE for intent understanding

Limitations

Relies on the quality of the constructed Knowledge Graph; incomplete graph data limits CoE effectiveness
Performance depends on the 'Xiachufang' dataset specifics; cross-cultural generalization not tested
No latency metrics reported for the multi-stage pipeline (CoE -> KG/RAG -> LLM)
The CoE refinement logic is described only abstractly; how much of it is rule-based versus model-driven is unclear

Reproducibility

No replication artifacts mentioned in the paper (no code URL, no data release). Dataset used is 'Xiachufang' but access details are not provided.

📊 Experiments & Results

Evaluation Setup

Evaluation on Xiachufang dataset using human-annotated queries (explicit and fuzzy)

Benchmarks:

Xiachufang Dataset (Custom Test Set) (Recipe Recommendation) [New]

Metrics:

Accuracy (1-10)
Relevance (1-10)
Completeness (1-10)
Clarity (1-10)
Unprocessed Queries (%)

Statistical methodology: Not explicitly reported in the paper

Key Results

ChefMind outperforms the ablation baselines in overall quality score and leaves far fewer queries unprocessed.

Benchmark	Metric	Baseline	This Paper	Δ
Xiachufang Custom Test Set	Average Score (1-10)	6.7 (LLM+RAG)	8.7	+2.0
Xiachufang Custom Test Set	Unprocessed Queries Rate	25.6% (LLM+KG)	1.6%	-24.0 pp
Xiachufang Custom Test Set	Unprocessed Queries Count	22	2	-20
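
The Δ values can be reproduced from the figures reported in this summary. The sketch below uses only those reported numbers; the `reported` dictionary and `delta` helper are illustrative names, and differences between rates are in percentage points.

```python
# Figures as reported in the summary: average score (1-10) and unprocessed-query rate (%).
reported = {
    "avg_score": {"ChefMind": 8.7, "LLM+RAG": 6.7, "LLM+KG": 6.4},
    "unprocessed_pct": {"ChefMind": 1.6, "LLM+RAG": 17.1, "LLM+KG": 25.6},
}


def delta(metric: str, baseline: str) -> float:
    """ChefMind minus baseline, rounded to one decimal place."""
    values = reported[metric]
    return round(values["ChefMind"] - values[baseline], 1)


for metric in reported:
    for baseline in ("LLM+RAG", "LLM+KG"):
        print(metric, "vs", baseline, delta(metric, baseline))
```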

Main Takeaways

ChefMind scores higher on accuracy, relevance, completeness, and clarity than the LLM+KG and LLM+RAG baselines
The CoE module is critical for handling fuzzy/ambiguous queries, reducing failure rates from ~17-25% to <2%
Combining structured (KG) and unstructured (RAG) retrieval provides a more complete recipe recommendation than either method alone

📚 Prerequisite Knowledge

Prerequisites

Knowledge Graphs (structure and traversal)
Retrieval-Augmented Generation (RAG) concepts
Vector similarity search

Key Terms

CoE: Chain of Exploration, a module that refines ambiguous user queries into structured search conditions through multi-step logic

KG: Knowledge Graph, a structured database storing entities (recipes, ingredients) and their relationships for semantic reasoning

RAG: Retrieval-Augmented Generation, which fetches relevant unstructured text chunks (cooking steps, tips) to ground LLM generation

Neo4j: A graph database used to store the structured recipe knowledge graph

Milvus: A vector database used to store and retrieve dense vector embeddings of recipe text for RAG

fuzzy demand: User queries that are ambiguous, abstract (e.g., 'healthy'), or very short (<5 chars), requiring refinement before database lookup

DeepSeek: The specific Large Language Model used as the generator and integrator in this system