Conversational Recommender Systems (CRS), Explainable Recommendation
COMPASS enhances conversational recommender systems by aligning knowledge graph embeddings with LLMs to generate interpretable, natural language summaries of user preferences from dialogue history.
Core Problem
Existing CRSs rely on latent vector representations for user preferences, which are opaque and lack explainability, while LLMs struggle to reason over domain-specific knowledge graphs due to the modality gap between structured graphs and unstructured text.
Why it matters:
Vector-based preferences hide the 'why' behind recommendations, reducing system transparency and user trust.
LLMs hallucinate or miss domain-specific item attributes (like specific actors or genres) without grounded knowledge from KGs.
Current methods lack cross-modal reasoning: they cannot effectively combine dynamic dialogue history with static, structured knowledge graph data.
Concrete Example: In a movie recommendation dialogue, a user might implicitly prefer 'sci-fi movies with time travel.' A standard CRS represents this as a hidden vector [0.2, -0.5, ...]. COMPASS explicitly generates the text: 'The user enjoys science fiction films featuring time travel elements,' allowing the system to verify and explain its subsequent recommendations.
Key Novelty
Two-stage Cross-Modal Alignment for Preference Summarization
First, aligns the Knowledge Graph space with the LLM space via 'graph entity captioning,' teaching the LLM to translate graph embeddings into text descriptions.
Second, employs 'knowledge-aware instruction tuning' to teach the LLM to synthesize dialogue history and KG-augmented context into structured preference summaries.
Architecture
The overall architecture and two-stage training process of COMPASS.
Evaluation Highlights
COMPASS is reported to improve recommendation performance when plugged into existing CRS models. This rests on the paper's 'demonstrate effectiveness' claim; specific numbers are not provided in the snippet.
Generates human-readable preference summaries that capture both overall preferences and current interests.
Successfully bridges the modality gap, enabling LLMs to reason over structured KG data without architectural modifications to the base CRS.
Breakthrough Assessment
7/10
Novel approach to bridging the KG-LLM modality gap for explainability. While the core idea of using LLMs for summaries is established, the specific two-stage alignment and the plug-and-play gating mechanism for existing CRSs are a strong contribution.
⚙️ Technical Details
Problem Definition
Setting: Conversational Recommendation where user preferences must be inferred from dialogue history and Knowledge Graphs to generate both explanations and recommendations.
Inputs: Dialogue history H_t up to turn t, and a Knowledge Graph G = (E, A, X)
Outputs: Textual user preference summary P_t and recommended items I_t
Pipeline Flow
Graph Encoder (R-GCN processing KG)
Graph-to-Text Adapter (Projecting embeddings to LLM space)
LLM Reasoning (Generating preference summary)
Integration Module (Encoding summary and gating with base CRS)
System Modules
Graph Encoder (Input Processing)
Encode structural information from the Knowledge Graph into entity embeddings
Model or implementation: Relational Graph Convolutional Network (R-GCN)
Graph-to-Text Adapter (Input Processing)
Project graph embeddings into the LLM's semantic space
Model or implementation: Linear Projection Layer
Large Language Model
Synthesize dialogue and KG info to generate textual preference summaries
Model or implementation: Unspecified LLM (compatible with state-of-the-art)
Preference Encoder & Gating
Inject generated preferences into base CRS models
Model or implementation: BERT encoder + Adaptive Gating Mechanism
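The Graph-to-Text Adapter listed above is described only as a linear projection layer. A minimal sketch of that idea follows, in plain Python so it runs without dependencies. The dimensions, the random initialisation, and the names `make_linear`, `project`, and `soft_token` are illustrative assumptions, not details from the paper.

```python
import random

def make_linear(in_dim, out_dim, seed=0):
    """Build a randomly initialised weight matrix (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias

def project(entity_embedding, weight, bias):
    """Apply y = W x + b to map one graph entity embedding into the LLM embedding space."""
    return [
        sum(w * x for w, x in zip(row, entity_embedding)) + b
        for row, b in zip(weight, bias)
    ]

GRAPH_DIM = 4  # assumed size of an R-GCN entity embedding (illustrative)
LLM_DIM = 8    # assumed size of an LLM token embedding (illustrative)

weight, bias = make_linear(GRAPH_DIM, LLM_DIM)
entity_embedding = [0.2, -0.5, 0.1, 0.7]  # stand-in for an R-GCN output vector
soft_token = project(entity_embedding, weight, bias)
print(len(soft_token))  # 8
```

In a real system the projected vector would presumably be fed to the LLM alongside its ordinary token embeddings, and the weights would be learned during the entity captioning stage rather than left random.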
Novel Architectural Elements
Graph-to-Text Adapter bridge specifically trained via entity captioning to align KG embeddings with LLM token space
Adaptive gating mechanism to fuse natural language preference summaries (encoded by BERT) with latent vectors from arbitrary base CRS models
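The summary does not give the exact form of the adaptive gate. One common formulation, shown here as an assumption, computes an element-wise sigmoid gate from the concatenated vectors and uses it to interpolate between the base CRS vector and the encoded preference summary. The function and parameter names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(crs_vec, pref_vec, gate_weights, gate_bias):
    """Fuse two same-sized vectors with an element-wise gate (assumed form).

    gate  = sigmoid(W @ [crs_vec; pref_vec] + b)
    fused = gate * crs_vec + (1 - gate) * pref_vec
    """
    concat = crs_vec + pref_vec  # list concatenation, length 2 * d
    fused = []
    for i, (u, p) in enumerate(zip(crs_vec, pref_vec)):
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(z)
        fused.append(g * u + (1.0 - g) * p)
    return fused

crs_vec = [1.0, 0.0]    # stand-in for the base CRS latent user vector
pref_vec = [0.0, 1.0]   # stand-in for the BERT encoding of the text summary
gate_weights = [[0.0] * 4, [0.0] * 4]  # zero weights give a neutral gate of 0.5
gate_bias = [0.0, 0.0]

fused = gated_fusion(crs_vec, pref_vec, gate_weights, gate_bias)
print(fused)  # [0.5, 0.5]
```

With a large positive bias the gate saturates near 1 and the output approaches the base CRS vector; with a large negative bias it approaches the summary encoding. This is what would let such a module be attached to an existing CRS without changing its architecture.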
Modeling
Base Model: Compatible with various LLMs (specific model not named in snippet)
Entity Captioning data: Pairs of (Entity Embedding, Text Description)
Instruction Tuning data: Pairs of (dialogue plus KG context, ground-truth preference summary generated by an advanced LLM, e.g., ChatGPT)
Compute: Not reported in the paper
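The two training sets described above can be pictured as simple records. The sketch below is illustrative only: the field names, the prompt wording, and the example values are assumptions, and the paper's actual data format is not given in the snippet.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CaptioningExample:
    """Stage 1 pair: a graph entity embedding and its text description."""
    entity_embedding: List[float]
    description: str

@dataclass
class InstructionExample:
    """Stage 2 pair: dialogue plus KG context, and the target summary."""
    dialogue: List[str]
    kg_context: List[str]
    target_summary: str

def build_prompt(example: InstructionExample) -> str:
    """Format the input side of an instruction-tuning pair (hypothetical template)."""
    dialogue_text = "\n".join(example.dialogue)
    kg_text = "; ".join(example.kg_context)
    return (
        "Summarize the user's preferences.\n"
        f"Dialogue:\n{dialogue_text}\n"
        f"Knowledge graph context: {kg_text}\n"
        "Summary:"
    )

caption = CaptioningExample(
    entity_embedding=[0.2, -0.5, 0.1],  # stand-in for an R-GCN vector
    description="A science fiction film about time travel.",
)
instruction = InstructionExample(
    dialogue=["User: I love sci-fi movies with time travel."],
    kg_context=["genre: science fiction", "theme: time travel"],
    target_summary="The user enjoys science fiction films featuring time travel elements.",
)
prompt = build_prompt(instruction)
print(prompt)
```

During instruction tuning, the model would be trained to produce `target_summary` given `prompt`; here the target reuses the example sentence from the Concrete Example above.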
Comparison to Prior Work
vs. KBRD/KGSF: COMPASS generates explicit natural language preference summaries rather than opaque latent vectors.
vs. DialoGPT: COMPASS integrates structured KG knowledge via a dedicated adapter, whereas DialoGPT relies solely on unstructured text training.
vs. MemoCRS: COMPASS leverages LLM reasoning for preference generation rather than just memory mechanisms [not cited in paper].
Limitations
Dependency on external advanced LLMs (like ChatGPT) for constructing ground-truth preference summaries.
Two-stage training process adds complexity compared to end-to-end differentiable models.
The quality of the preference summary is heavily dependent on the quality and completeness of the underlying Knowledge Graph.
Reproducibility
Code availability is not explicitly provided in the text. The method uses an advanced LLM (e.g., ChatGPT) to generate ground-truth preference summaries for training, creating a dependency on proprietary models for data construction.
📊 Experiments & Results
Evaluation Setup
Conversational recommendation using benchmark datasets
Benchmarks:
Not specifically named in text (Conversational Recommendation)
Metrics:
Not explicitly reported in the paper
Statistical methodology: Not explicitly reported in the paper
Main Takeaways
The paper claims to demonstrate effectiveness on benchmark datasets, but specific numeric results are not contained in the provided text snippet.
Qualitative analysis shows the model can generate structured summaries containing 'Reasoning', 'Overall Preferences', 'Current Interests', and 'Recommendation'.
The adaptive gating mechanism allows the system to balance between the base CRS's latent representation and the LLM's explicit preference summary.
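The four-part summary structure mentioned above ('Reasoning', 'Overall Preferences', 'Current Interests', 'Recommendation') lends itself to simple downstream parsing. The sketch below assumes each section starts on its own line as `Name: text`. That layout is an assumption made for illustration, and the example summary is invented rather than taken from the paper.

```python
import re

SECTIONS = ["Reasoning", "Overall Preferences", "Current Interests", "Recommendation"]

def parse_summary(text):
    """Split a generated summary into its named sections; missing ones map to ''."""
    names = "|".join(re.escape(name) for name in SECTIONS)
    pattern = re.compile(r"^(%s):[ \t]*" % names, re.MULTILINE)
    parsed = {name: "" for name in SECTIONS}
    matches = list(pattern.finditer(text))
    for i, match in enumerate(matches):
        # A section runs until the next recognised header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        parsed[match.group(1)] = text[match.end():end].strip()
    return parsed

example = """Reasoning: The user asked for films involving time travel.
Overall Preferences: Enjoys science fiction films.
Current Interests: Time travel stories.
Recommendation: Suggest science fiction titles featuring time travel."""

parsed = parse_summary(example)
print(parsed["Current Interests"])  # Time travel stories.
```

Keeping the sections separate makes it possible to inspect, for instance, whether the 'Current Interests' field tracks the latest dialogue turn while 'Overall Preferences' stays stable.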
📚 Prerequisite Knowledge
Prerequisites
Conversational Recommender Systems (CRS)
Knowledge Graphs (KG) and Graph Neural Networks (GNN)
Large Language Models (LLM) and Instruction Tuning
Key Terms
CRS: Conversational Recommender System—systems that elicit user preferences through multi-turn natural language dialogue.
KG: Knowledge Graph—structured representation of data (entities and relationships), used here to represent items and attributes.
R-GCN: Relational Graph Convolutional Network—a type of neural network designed to handle multi-relational graph data.
Modality Gap: The representational mismatch between structured graph data (nodes/edges) and unstructured natural language tokens.
Graph Entity Captioning: A pre-training task where the model learns to generate natural language descriptions from graph entity embeddings.
Instruction Tuning: Fine-tuning an LLM on datasets formatted as instructions (input) and desired responses (output) to improve task performance.
NLL: Negative Log-Likelihood—a loss function used to train language models by maximizing the probability of the correct next token.
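For reference, the standard token-level form of the NLL loss can be written as follows. The symbols are assumptions introduced here: x is the input context (for example, dialogue plus KG context), y_1, ..., y_T are the target tokens, and θ denotes the trainable parameters. The snippet does not state the paper's exact objective.

```latex
\mathcal{L}_{\mathrm{NLL}}(\theta) = -\sum_{t=1}^{T} \log p_{\theta}\left(y_t \mid y_{<t},\, x\right)
```

Minimizing this quantity is equivalent to maximizing the probability the model assigns to each correct next token.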