SynthTRIPs: A Knowledge-Grounded Framework for Benchmark Query Generation for Personalized Tourism Recommenders

📝 Paper Summary

Synthetic Data Generation, Tourism Recommender Systems (TRS), Knowledge-Grounded Generation

SynthTRIPs generates synthetic, personalized travel queries by grounding Large Language Models in a factual knowledge base to incorporate complex constraints like sustainability and budget without hallucination.

Core Problem

Public travel datasets lack depth and nuance, failing to capture specific preferences like sustainability, walkability, or strict budget constraints, which limits the development of personalized recommender systems.

Why it matters:

Current datasets focus mainly on popular cities and generic queries, ignoring the shift toward sustainable and off-peak tourism
Research on advanced personalization is hindered because real-world data with rich user annotations (personas) and contextual filters is scarce and privacy-sensitive

Concrete Example: A standard dataset might handle 'hotels in Paris', but fails to support a query like 'Find a low-budget, walkable city in Europe with unusual museums or a hidden, alternative nightlife scene', as it lacks specific attribute annotations for walkability and 'alternative' tags.

Key Novelty

Knowledge-Grounded Persona-Based Query Generation

Combines rich user personas (from PersonaHub) with structured constraint filters (budget, sustainability, seasonality) to define specific travel scenarios
Retrieves valid city attributes from a verified Knowledge Base (KB) *before* generation to force the LLM to include only factual, existing destination details

Architecture

The SynthTRIPs pipeline for generating synthetic travel queries. It illustrates the three main blocks: Persona Hub, Travel Filters, and Contextual Prompting.

Evaluation Highlights

Generated a resource of 2,302 valid configurations (combinations of personas and filters) for European city trips
Refined 200 diverse user personas from a pool of ~200k using BERTopic modeling to ensure broad coverage of traveler types (e.g., budget student vs. luxury family)

Breakthrough Assessment

7/10

Valuable resource paper addressing a specific gap (sustainability/personalization data in tourism). While the methodology is a straightforward application of LLM grounding, the resulting dataset enables new research directions.

⚙️ Technical Details

Problem Definition

Setting: Synthetic data generation for Recommender Systems

Inputs: User persona p, Set of travel filters f (budget, sustainability, etc.), Knowledge Base KB

Outputs: Natural language user query q

Pipeline Flow

Data Preparation: Persona Selection & KB Construction
Constraint Definition: Filter Selection (Group: Configuration)
Retrieval: KB Querying (Group: Configuration)
Generation: Prompt Construction & LLM Inference (Group: Generation)
Validation: Output Parsing (Group: Post-processing)
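The stages above can be sketched as a single function that wires together interchangeable components. This is a minimal illustration under stated assumptions, not the authors' code: the names `generate_query`, `retrieve`, `build_prompt`, `llm`, and `parse` are hypothetical, the LLM is replaced by a stub that returns a fixed string, and the cities are placeholders.

```python
from typing import Callable, Dict, List


def generate_query(
    persona: str,
    filters: Dict[str, str],
    retrieve: Callable[[Dict[str, str]], List[dict]],
    build_prompt: Callable[[str, Dict[str, str], List[dict]], str],
    llm: Callable[[str], str],
    parse: Callable[[str], str],
) -> dict:
    """Run one persona/filter pair through retrieval, generation, and parsing."""
    cities = retrieve(filters)  # Retrieval: the KB lookup happens before any LLM call
    if not cities:
        raise ValueError("no city satisfies the filters; skip this combination")
    prompt = build_prompt(persona, filters, cities)  # Generation: grounded prompt
    raw = llm(prompt)
    query = parse(raw)  # Validation: output parsing
    return {
        "persona": persona,
        "filters": filters,
        "cities": [c["name"] for c in cities],
        "query": query,
    }


# Toy stand-ins (all hypothetical) so the sketch runs without a KB or an LLM.
def toy_retrieve(filters: Dict[str, str]) -> List[dict]:
    kb = [{"name": "City A", "budget": "low"}, {"name": "City B", "budget": "high"}]
    return [c for c in kb if c["budget"] == filters["budget"]]


def toy_prompt(persona: str, filters: Dict[str, str], cities: List[dict]) -> str:
    names = ", ".join(c["name"] for c in cities)
    return f"Persona: {persona}\nFilters: {filters}\nCities: {names}"


def toy_llm(prompt: str) -> str:
    return "  Where can I go in Europe on a low budget?  "


record = generate_query(
    "a budget-conscious student", {"budget": "low"},
    toy_retrieve, toy_prompt, toy_llm, str.strip,
)
print(record)
```

Passing the components as callables keeps the ordering explicit: retrieval always runs first, and a filter set that matches no city is rejected before any text is generated.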

System Modules

Persona Selector (Configuration)

Selects a representative user profile from 200 clustered personas

Model or implementation: BERTopic (for clustering)
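One way to turn clustered personas into a small representative set is to keep a single persona per topic. The sketch below assumes topic labels and assignment probabilities have already been computed (for example by BERTopic, which labels outliers with topic `-1`). The selection rule (highest probability per topic) and the function name `pick_representatives` are assumptions for illustration; the summary does not state how the paper picks representatives.

```python
from typing import Dict, List, Tuple


def pick_representatives(assignments: List[Tuple[str, int, float]]) -> Dict[int, str]:
    """Keep the highest-probability persona for each topic, ignoring outliers (-1)."""
    best: Dict[int, Tuple[str, float]] = {}
    for text, topic, prob in assignments:
        if topic == -1:  # outlier bucket, not a real topic
            continue
        if topic not in best or prob > best[topic][1]:
            best[topic] = (text, prob)
    return {topic: text for topic, (text, _) in sorted(best.items())}


# (persona text, topic id, assignment probability); all values are made up.
assignments = [
    ("A backpacking student on a tight budget", 0, 0.91),
    ("A gap-year traveler who sleeps in hostels", 0, 0.74),
    ("A parent planning a luxury family holiday", 1, 0.88),
    ("A retired chess coach", -1, 0.00),
]
representatives = pick_representatives(assignments)
print(representatives)
```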

Filter Engine (Configuration)

Defines constraints for the trip

Model or implementation: Rule-based
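A rule-based filter engine can be pictured as enumerating candidate filter sets and discarding those that break a rule. The sketch below is illustrative only: the filter values and the single rule (at least one filter must be active) are assumptions, not the rules used in the paper.

```python
from itertools import product
from typing import Dict, List, Optional

# None means "this filter is not applied". The values are hypothetical.
FILTER_VALUES: Dict[str, List[Optional[str]]] = {
    "budget": [None, "low", "medium", "high"],
    "season": [None, "winter", "summer"],
    "walkability": [None, "high"],
}


def is_valid(combo: Dict[str, Optional[str]]) -> bool:
    """Hypothetical rule: a filter set must constrain at least one attribute."""
    return any(value is not None for value in combo.values())


def enumerate_filter_sets() -> List[Dict[str, str]]:
    """Return every valid filter set with the inactive filters dropped."""
    names = list(FILTER_VALUES)
    result = []
    for values in product(*(FILTER_VALUES[n] for n in names)):
        combo = dict(zip(names, values))
        if is_valid(combo):
            result.append({k: v for k, v in combo.items() if v is not None})
    return result


filter_sets = enumerate_filter_sets()
print(len(filter_sets))  # 4 * 3 * 2 = 24 candidates, minus the empty one = 23
```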

KB Retriever (Configuration)

Queries the KB to find cities satisfying the filter set f

Model or implementation: Structured Database Query
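The retrieval step can be illustrated with an in-memory SQLite table. The schema, column names, thresholds, and city rows below are placeholders invented for this sketch. They are not taken from the SynthTRIPs KB, whose actual storage format is not described in this summary.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cities (name TEXT, budget TEXT, walkability REAL, aqi REAL)"
)
# Placeholder rows with made-up attribute values.
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?, ?)",
    [
        ("City A", "low", 0.82, 35.0),
        ("City B", "low", 0.40, 60.0),
        ("City C", "high", 0.90, 20.0),
    ],
)


def retrieve_cities(conn: sqlite3.Connection, budget: str,
                    min_walkability: float) -> List[dict]:
    """Return the cities whose attributes satisfy every filter in the set."""
    rows = conn.execute(
        "SELECT name, walkability, aqi FROM cities "
        "WHERE budget = ? AND walkability >= ? ORDER BY name",
        (budget, min_walkability),  # parameters are bound, never interpolated
    ).fetchall()
    return [{"name": n, "walkability": w, "aqi": a} for n, w, a in rows]


matches = retrieve_cities(conn, "low", 0.7)
print(matches)
```

An empty result is itself informative: it means the filter set has no valid destination and should not be sent to the generator.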

Query Generator

Generates the natural language travel query based on persona and valid facts

Model or implementation: llama-3.2-90b or gemini-1.5-pro

Novel Architectural Elements

Pre-generation grounding: The system retrieves valid entity attributes (cities matching filters) *before* prompting the LLM, strictly constraining the generation context to avoid hallucination
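Pre-generation grounding amounts to placing the retrieved facts in the prompt and instructing the model to stay within them. The following is a minimal sketch of such a prompt builder. The wording of the instructions, the function name `build_grounded_prompt`, and the sample inputs are assumptions rather than the paper's actual prompt template.

```python
import json
from typing import Dict, List


def build_grounded_prompt(persona: str, filters: Dict[str, str],
                          cities: List[dict]) -> str:
    """Assemble a prompt whose only factual content is the retrieved KB rows."""
    facts = json.dumps(cities, indent=2, sort_keys=True)
    return (
        "You are writing a travel request in the voice of this persona:\n"
        f"{persona}\n\n"
        f"Trip constraints: {json.dumps(filters, sort_keys=True)}\n\n"
        "Use ONLY the facts below. Do not mention any place or attribute "
        "that is not listed.\n"
        f"{facts}\n\n"
        "Write one natural-language query."
    )


# Placeholder inputs; the attribute values are made up.
prompt = build_grounded_prompt(
    "a budget-conscious student who likes alternative nightlife",
    {"budget": "low", "walkability": "high"},
    [{"name": "City A", "walkability": 0.82}],
)
print(prompt)
```

Because the facts are serialized from the retrieval result, anything the model says about a destination can later be checked against the same rows.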

Modeling

Base Model: llama-3.2-90b and gemini-1.5-pro

Compute: Not reported in the paper

Comparison to Prior Work

vs. TravelPlanner: SynthTRIPs focuses on *query generation* with diverse personas (200 types) rather than just planning agents
vs. Reddit Q&A: SynthTRIPs provides ground-truth alignment with structured filters (budget, AQI) which are often implicit or missing in scraped data
vs. FeB4RAG [not cited in paper]: FeB4RAG generates synthetic queries for general RAG; SynthTRIPs is domain-specialized for tourism with specific entities (cities) and constraints (seasonality)

Limitations

Currently limited to European cities; expanding requires updating the underlying Knowledge Base
Relies on the quality of the underlying LLM (llama-3.2-90b or gemini-1.5-pro) for stylistic nuances
Evaluation results (Section 5) are missing from the provided text snippet

Reproducibility

Code: https://bit.ly/synthTRIPs

All components are publicly available: (1) Structured KB of European cities, (2) Generated query dataset, (3) Code for generation (Colab notebooks) and evaluation.

📊 Experiments & Results

Evaluation Setup

Validation of synthetic resource quality

Benchmarks:

SynthTRIPs Dataset (Synthetic Data Generation) [New]

Metrics:

Groundedness (Factual Correctness)
Persona Alignment
Sustainability Compliance

Statistical methodology: Not explicitly reported in the paper
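Since the summary does not define the metrics, the sketch below shows only one simple proxy for groundedness: the share of KB entities mentioned in a generated text that were actually in the retrieved set. The function name, the string-matching approach, and the convention of scoring 1.0 when no entity is mentioned are all assumptions, and the paper's own measure may differ.

```python
import re
from typing import Iterable


def groundedness(text: str, allowed: Iterable[str], vocabulary: Iterable[str]) -> float:
    """Share of KB entities mentioned in `text` that belong to the retrieved set."""
    mentioned = {
        name for name in vocabulary
        if re.search(rf"\b{re.escape(name)}\b", text, flags=re.IGNORECASE)
    }
    if not mentioned:
        return 1.0  # assumption: no entity mentioned means nothing is ungrounded
    return len(mentioned & set(allowed)) / len(mentioned)


vocabulary = ["City A", "City B", "City C"]  # every entity name in the toy KB
score = groundedness(
    "I am torn between City A and City C.",
    allowed=["City A"],  # only City A was retrieved for this filter set
    vocabulary=vocabulary,
)
print(score)  # 0.5: two cities mentioned, one of them retrieved
```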

Main Takeaways

The framework successfully automates the creation of diverse travel queries by pairing 200 distinct personas with 2,302 valid filter combinations
Grounding LLM generation in a curated Knowledge Base ensures that complex queries (e.g., 'sustainable city with good nightlife') map to real-world locations, preventing the hallucination common in naive LLM prompting
The resource specifically addresses the 'depth' gap in tourism datasets by explicitly annotating sustainability factors like walkability, air quality, and off-peak travel suggestions

📚 Prerequisite Knowledge

Prerequisites

Recommender Systems (specifically Tourism/Travel domain)
Large Language Models (LLMs) and Prompt Engineering
In-Context Learning (ICL)

Key Terms

TRS: Tourism Recommender Systems—AI systems designed to suggest travel destinations or itineraries based on user preferences

KB: Knowledge Base—a structured database containing factual attributes (e.g., city costs, air quality, walkability scores) used to ground the model

Hallucination: A failure mode where an LLM generates plausible-sounding but factually incorrect information (e.g., inventing a museum that doesn't exist)

ICL: In-Context Learning—providing examples within the prompt (e.g., few-shot) to guide the model's output style without updating its weights

PersonaHub: A large-scale dataset of diverse synthetic user profiles (personas) used to seed the generation process

BERTopic: A topic modeling technique that clusters transformer (BERT-style) document embeddings and uses class-based TF-IDF to describe each cluster; here the documents are personas, grouped into coherent topics

Walkability Index: A metric quantifying how friendly an area is to walking, used here as a sustainability filter