
TEARS: Textual Representations for Scrutable Recommendations

Emiliano Penaloza, Olivier Gouvert, Haolun Wu, Laurent Charlin
Université de Montréal, Mila - Quebec AI Institute, McGill University, HEC Montréal
arXiv (2024)
Recommendation P13N Benchmark

📝 Paper Summary

Scrutable Recommender Systems Collaborative Filtering with LLMs
TEARS aligns LLM-generated textual user summaries with collaborative filtering embeddings using optimal transport, enabling users to edit their preferences in natural language while maintaining high recommendation performance.
Core Problem
Traditional recommender systems use high-dimensional numeric embeddings that are opaque (hard to interpret) and offer limited control to users, who can typically only influence recommendations through coarse interactions like clicks.
Why it matters:
  • Users lack transparency into why items are recommended, as numeric latent vectors are uninterpretable.
  • Correcting bad recommendations is tedious; users must consume/rate more items hoping for a change rather than directly editing their profile.
  • Existing scrutable methods (like tag clouds) impose high cognitive load or sacrifice performance compared to black-box models.
Concrete Example: A user wants to stop receiving horror movie recommendations. In standard systems, they must laboriously rate horror movies negatively. In TEARS, they can simply delete 'Horror' from their textual summary or add 'I dislike scary movies' to immediately update the recommendations.
Key Novelty
TExtuAl Representations for Scrutable recommendations (TEARS)
  • Replaces or augments opaque numeric user embeddings with natural language summaries generated by an LLM based on user history.
  • Aligns the textual embedding space with a powerful black-box collaborative filtering space using Optimal Transport (OT) to ensure text edits meaningfully impact recommendations.
  • Allows a tunable mix (convex combination) of text-based and behavior-based representations, letting users trade off between pure controllability and maximum historical accuracy.
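The tunable mix in the last bullet can be sketched as a convex combination of two user vectors. This is a minimal illustration, not the authors' implementation: the function name `blend_user_embeddings`, the argument names, and the convention that `alpha = 1` means "text only" are all assumptions made for this example.

```python
import numpy as np

def blend_user_embeddings(z_text, z_behavior, alpha):
    """Convex combination of a text-based and a behavior-based user embedding.

    Assumed convention (not confirmed by the summary): alpha = 1.0 uses only
    the text embedding, alpha = 0.0 uses only the behavior embedding.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    z_text = np.asarray(z_text, dtype=float)
    z_behavior = np.asarray(z_behavior, dtype=float)
    if z_text.shape != z_behavior.shape:
        raise ValueError("both embeddings must live in the same latent space")
    # Weighted average: the result stays on the segment between the two inputs.
    return alpha * z_text + (1.0 - alpha) * z_behavior

# Hypothetical 2-d embeddings, chosen only to make the arithmetic visible.
z_mixed = blend_user_embeddings([1.0, 0.0], [0.0, 1.0], alpha=0.25)
```

Under this convention, raising `alpha` gives text edits more influence over the final representation, while lowering it leans on interaction history.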
Architecture
Figure 1
The figure shows the TEARS dual-encoder setup: user history is processed by a black-box VAE, while user summaries are processed by a text encoder. The two latent representations are aligned via Optimal Transport and combined (weighted by alpha) before decoding.
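To give a feel for what an Optimal Transport alignment term measures, the sketch below computes an entropic-regularized OT cost between two batches of embeddings with Sinkhorn iterations. It is a generic illustration and may differ from the OT formulation used in the paper: the function name `sinkhorn_cost`, the uniform weights, the squared Euclidean cost, and the values of `eps` and `n_iters` are all assumptions.

```python
import numpy as np

def sinkhorn_cost(x, y, eps=0.1, n_iters=200):
    """Entropic-regularized OT cost between two point clouds (illustrative).

    x: (n, d) array, e.g. text-based user embeddings (assumed role).
    y: (m, d) array, e.g. behavior-based user embeddings (assumed role).
    Returns (transport_cost, transport_plan).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Pairwise squared Euclidean distances, shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    # Rescale by the largest cost so exp() below does not underflow to zero.
    scale = cost.max() or 1.0
    kernel = np.exp(-cost / (eps * scale))
    a = np.full(n, 1.0 / n)  # uniform mass on the x points
    b = np.full(m, 1.0 / m)  # uniform mass on the y points
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()), plan

# Toy check: identical clouds are cheap to align, shifted clouds are not.
points = np.array([[0.0, 0.0], [1.0, 1.0]])
near_cost, near_plan = sinkhorn_cost(points, points)
far_cost, _ = sinkhorn_cost(points, points + 3.0)
```

In a training loop, a cost of this kind would be minimized so that the text-encoder outputs move toward the collaborative filtering embeddings. A differentiable framework would be needed for that, which this NumPy sketch does not provide.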
Evaluation Highlights
  • TEARS-RecVAE outperforms the standard RecVAE baseline on the ML-1M dataset (NDCG@100: 0.444 vs. 0.434), suggesting that adding aligned text representations can improve ranking quality.
  • In a 'flip' controllability task (swapping favorite/least-favorite genres), TEARS achieves a 99.7% success rate in shifting recommendations, compared to 0.0% for a genre-tag baseline.
  • Using Optimal Transport for alignment yields far higher controllability (99.7% Flip Ratio) than the contrastive loss (24.7%) or JS-divergence (31.3%) baselines.
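For readers unfamiliar with the NDCG@100 metric quoted above, here is a minimal sketch of NDCG@k with binary relevance, using the common definition with a log2 position discount. The function name `ndcg_at_k` and the item ids are hypothetical, and the paper's exact evaluation protocol (for example, how held-out items are chosen) is not reproduced here.

```python
import numpy as np

def ndcg_at_k(ranked_items, relevant_items, k=100):
    """NDCG@k with binary relevance (standard definition, illustrative)."""
    relevant = set(relevant_items)
    if not relevant or k <= 0:
        return 0.0
    top = list(ranked_items)[:k]
    # Position i (1-based) is discounted by 1 / log2(i + 1).
    discounts = 1.0 / np.log2(np.arange(2, len(top) + 2))
    gains = np.array([1.0 if item in relevant else 0.0 for item in top])
    dcg = float((gains * discounts).sum())
    # Ideal ranking puts every relevant item (up to k of them) first.
    ideal_hits = min(len(relevant), k)
    idcg = float((1.0 / np.log2(np.arange(2, ideal_hits + 2))).sum())
    return dcg / idcg

# Hypothetical item ids: the single relevant item is ranked third.
score = ndcg_at_k(ranked_items=[10, 20, 30], relevant_items=[30], k=3)
```

Because the discount grows with rank, the same set of hits scores higher when the relevant items appear nearer the top of the list.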
Breakthrough Assessment
8/10
Successfully bridges the gap between high-performance black-box recommenders and user-controllable text interfaces. The use of Optimal Transport for alignment is a strong methodological contribution.