
SynthTRIPs: A Knowledge-Grounded Framework for Benchmark Query Generation for Personalized Tourism Recommenders

Ashmi Banerjee, Adithi Satish, Fitri Nur Aisyah, Wolfgang Wörndl, Yashar Deldjoo
Technical University of Munich, Polytechnic University of Bari
arXiv (2025)
P13N, Recommendation, Factuality, Benchmark, KG

📝 Paper Summary

Synthetic Data Generation, Tourism Recommender Systems (TRS), Knowledge-Grounded Generation
SynthTRIPs generates synthetic, personalized travel queries by grounding large language models (LLMs) in a factual knowledge base, so that the queries can carry complex constraints such as sustainability and budget while avoiding hallucinated details.
Core Problem
Public travel datasets lack depth and nuance, failing to capture specific preferences like sustainability, walkability, or strict budget constraints, which limits the development of personalized recommender systems.
Why it matters:
  • Current datasets focus mainly on popular cities and generic queries, ignoring the shift toward sustainable and off-peak tourism
  • Research on advanced personalization is hindered because real-world data with rich user annotations (personas) and contextual filters is scarce and privacy-sensitive
Concrete Example: A standard dataset might support 'hotels in Paris' but not a query like 'Find a low-budget, walkable city in Europe with unusual museums or a hidden, alternative nightlife scene', because it lacks attribute annotations for walkability and tags such as 'alternative'.
Key Novelty
Knowledge-Grounded Persona-Based Query Generation
  • Combines rich user personas (from PersonaHub) with structured constraint filters (budget, sustainability, seasonality) to define specific travel scenarios
  • Retrieves valid city attributes from a verified Knowledge Base (KB) *before* generation to force the LLM to include only factual, existing destination details
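The two steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy knowledge base, the attribute names (`budget`, `walkability`, `tags`), the exact-match filter format, and the helpers `retrieve_cities` and `build_prompt` are all hypothetical.

```python
from dataclasses import dataclass

# Toy knowledge base. Cities and attribute values are illustrative
# placeholders, not data from the paper.
KB = {
    "Ljubljana": {"budget": "low", "walkability": "high",
                  "tags": ["museums", "alternative nightlife"]},
    "Zurich": {"budget": "high", "walkability": "high",
               "tags": ["museums", "lakes"]},
    "Sofia": {"budget": "low", "walkability": "medium",
              "tags": ["history", "hiking"]},
}


@dataclass
class Persona:
    description: str


def retrieve_cities(kb, filters):
    """Return cities whose KB attributes satisfy every filter (exact match)."""
    return {
        city: attrs
        for city, attrs in kb.items()
        if all(attrs.get(key) == value for key, value in filters.items())
    }


def build_prompt(persona, filters, grounded):
    """Assemble an LLM prompt that exposes only facts retrieved from the KB."""
    facts = "\n".join(
        f"- {city}: " + ", ".join(attrs["tags"])
        for city, attrs in sorted(grounded.items())
    )
    constraints = ", ".join(f"{k}={v}" for k, v in sorted(filters.items()))
    return (
        f"Persona: {persona.description}\n"
        f"Constraints: {constraints}\n"
        f"Verified destination facts:\n{facts}\n"
        "Write a travel query this persona might ask, using only the facts above."
    )


persona = Persona("A student who likes unusual museums")
filters = {"budget": "low", "walkability": "high"}

# Retrieval happens before generation, so the prompt never mentions
# cities or attributes that are absent from the KB.
grounded = retrieve_cities(KB, filters)
prompt = build_prompt(persona, filters, grounded)
```

The resulting `prompt` string would then be passed to an LLM of choice. Because only retrieved attributes appear in it, the generated query can be checked against the same KB entries afterwards.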
Architecture
Figure 1: The SynthTRIPs pipeline for generating synthetic travel queries. It illustrates the three main blocks: Persona Hub, Travel Filters, and Contextual Prompting.
Evaluation Highlights
  • Generated a resource of 2,302 valid key functions (combinations of personas and filters) for European city trips
  • Selected 200 diverse user personas from a pool of ~200k using BERTopic topic modeling to ensure broad coverage of traveler types (e.g., budget student vs. luxury family)
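One plausible way to turn topic assignments into a small, diverse persona set is a round-robin pick across topics, sketched below. The summary does not describe the authors' actual selection rule, so `select_diverse` and its round-robin strategy are assumptions. The topic ids are assumed to come from a topic model run beforehand; `-1` is treated as the outlier topic, which is the label BERTopic uses for outliers.

```python
from collections import defaultdict
from itertools import zip_longest


def select_diverse(personas, topics, k):
    """Pick up to k personas, cycling over topics so each one is represented."""
    if k <= 0:
        return []
    by_topic = defaultdict(list)
    for persona, topic in zip(personas, topics):
        if topic != -1:  # skip personas the topic model marked as outliers
            by_topic[topic].append(persona)

    selected = []
    # Each round takes at most one persona from every topic, in topic-id order.
    for round_ in zip_longest(*(by_topic[t] for t in sorted(by_topic))):
        for persona in round_:
            if persona is not None:
                selected.append(persona)
                if len(selected) == k:
                    return selected
    return selected


# Hypothetical personas and topic ids, for illustration only.
personas = ["budget student", "backpacker", "luxury family", "unclear text", "eco hiker"]
topics = [0, 0, 1, -1, 2]
picked = select_diverse(personas, topics, 3)
```

With these inputs, `picked` contains one persona from each of the three non-outlier topics before any topic contributes a second one.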
Breakthrough Assessment
7/10
Valuable resource paper addressing a specific gap (sustainability/personalization data in tourism). While the methodology is a straightforward application of LLM grounding, the resulting dataset enables new research directions.