From Prompting to Alignment: A Generative Framework for Query Recommendation

Erxue Min, Hsiu-Yuan Huang, Xihong Yang, Min Yang, Xin Jia, Yunfang Wu, Hengyi Cai, Junfeng Wang, Shuaiqiang Wang, Dawei Yin
Baidu Inc., Peking University, Chinese Academy of Sciences
arXiv (2025)
Recommendation RL RAG P13N

📝 Paper Summary

Query Recommendation Generative Search
GQR unifies query recommendation tasks into a single generative framework that aligns LLMs with user click preferences via a CTR-based reward model and grounds generation in retrieved co-occurrence patterns.
Core Problem
Traditional query recommendation relies on sparse historical logs, failing on cold-start queries, while existing LLM approaches generate semantically plausible but often unclickable queries due to a lack of alignment with real user feedback.
Why it matters:
  • Sparsity in historical interactions makes conventional methods ineffective for long-tail or new queries
  • Existing solutions are siloed (separate models for suggestion vs. completion), limiting generalization to new contexts like conversational search
  • Without aligning to click signals, LLMs may produce hallucinations or irrelevant suggestions that degrade user experience in commercial search engines
Concrete Example: In a cold-start scenario where a user types a novel query, a log-based system returns nothing due to zero co-occurrence data. A standard LLM might generate a grammatically correct but irrelevant query based on internal knowledge. GQR aims to generate a query that is both semantically relevant and statistically likely to be clicked.
Key Novelty
Generative Query Recommendation (GQR) with CTR-Alignment
  • Unifies diverse tasks (suggestion, completion, clarification) under one generative prompt template rather than separate specialized models
  • Treats the Click-Through Rate (CTR) predictor as a Process Reward Model (PRM) to guide the LLM via Direct Preference Optimization (DPO), ensuring outputs match user preferences
  • Augments the LLM with 'User Initiative' by retrieving co-occurrence queries as side information, bridging the gap between the model's internal knowledge and proactive search patterns
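The three ideas above can be sketched together in a few lines. This is a minimal illustration, not the paper's implementation. The prompt wording, the `min_gap` threshold, the best-versus-worst pairing rule, and all function names are hypothetical. The CTR scores are assumed to come from some external click predictor, and the log-probabilities are assumed to be sequence-level values from the policy and a frozen reference model. Only the DPO loss follows the standard published formula.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical task instructions; the summary does not give the real template.
TASK_INSTRUCTIONS = {
    "suggestion": "Suggest a follow-up query the user is likely to click.",
    "completion": "Complete the user's partial query.",
    "clarification": "Propose a query that clarifies the user's intent.",
}

def build_prompt(task: str, query: str, cooccurring: Sequence[str]) -> str:
    """One template for all tasks, with retrieved co-occurrence queries as side information."""
    side_info = "; ".join(cooccurring) if cooccurring else "(none retrieved)"
    return (
        f"Task: {TASK_INSTRUCTIONS[task]}\n"
        f"User query: {query}\n"
        f"Co-occurring queries: {side_info}\n"
        "Recommended query:"
    )

@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str

def make_pair(prompt: str, candidates: Sequence[str], ctr_scores: Sequence[float],
              min_gap: float = 0.05) -> Optional[PreferencePair]:
    """Pair the highest- and lowest-scored candidates (an assumed pairing rule).

    Returns None when there are fewer than two candidates or the predicted
    CTR gap is below min_gap, so near-ties do not become training signal.
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(zip(candidates, ctr_scores), key=lambda cs: cs[1])
    (worst, low), (best, high) = ranked[0], ranked[-1]
    if high - low < min_gap:
        return None
    return PreferencePair(prompt, chosen=best, rejected=worst)

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * log-ratio margin))."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(m)) equals softplus(-m); computed in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

prompt = build_prompt("suggestion", "electric bike range", ["electric bike battery life"])
pair = make_pair(prompt, ["e-bike range per charge", "bike"], [0.12, 0.03])
```

The loss decreases as the policy raises the likelihood of the higher-CTR candidate relative to the reference model, which is how click preference would reach the generator in a setup like this.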
Architecture
Figure 4: The overall learning framework of GQR, illustrating the cycle of SFT, CTR Alignment, and Periodic Updates.
Evaluation Highlights
  • Reports CTR (Click-Through Rate) gains on the order of 60% over LLM baselines in Baidu's conversational search services
  • Unified framework successfully deployed across three distinct scenarios (suggestion, completion, clarification) within a large-scale commercial system
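For readers unfamiliar with how a CTR gain of this kind is usually computed, here is a small worked sketch. It assumes the reported figure is a relative lift over the baseline, which this summary does not state explicitly. The click and impression counts are made-up illustrative numbers, not results from the paper.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate: clicks divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions

def relative_lift(baseline_ctr: float, treatment_ctr: float) -> float:
    """Relative improvement of the treatment CTR over the baseline CTR."""
    if baseline_ctr <= 0:
        raise ValueError("baseline CTR must be positive")
    return (treatment_ctr - baseline_ctr) / baseline_ctr

# Hypothetical counts chosen only to illustrate the arithmetic.
baseline = ctr(clicks=50, impressions=1000)   # 5% CTR
treatment = ctr(clicks=80, impressions=1000)  # 8% CTR
lift = relative_lift(baseline, treatment)
```

With these illustrative counts, a move from 5% to 8% CTR is a three-point absolute change but a relative lift of about 60%, which is why the distinction matters when reading headline numbers.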
Breakthrough Assessment
7/10
Novel application of alignment (DPO) specifically for CTR maximization in query recommendation. Significant commercial deployment claims (60% CTR boost), though detailed experimental breakdowns are missing from the provided text.