Latent Inter-User Difference Modeling for LLM Personalization

📝 Paper Summary

User-profile based personalization · Conversational personalization

DEP improves personalization by modeling the differences between a user and their peers as contrastive latent embeddings, distilled via a sparse autoencoder and injected as soft prompts.

Core Problem

Existing personalization methods such as DPL compare users in natural language. This produces verbose prompts that strain the context window, and it fails to capture fine-grained behavioral distinctions precisely.

Why it matters:

Raw text is a poor medium for extracting user differences: LLMs often miss subtle distinctions when asked to summarize them in natural language
Including raw peer data in prompts consumes excessive tokens, limiting the model's ability to process other relevant context
Effective personalization requires capturing not just what a user likes, but specifically how their preferences deviate from the norm (individuality)

Concrete Example: In DPL, an LLM is given raw reviews from User A and User B and asked to 'describe the difference' in text, which tends to yield vague summaries. DEP instead computes the vector difference between their embeddings (Vector A - Vector B), filters out noise, and injects the resulting signal into the LLM as a soft prompt.
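The subtraction in the example above can be illustrated with a minimal sketch. The vectors are made-up 4-dimensional values standing in for real review embeddings, and the helper name `subtract` is hypothetical, not taken from the paper or its code.

```python
def subtract(a, b):
    """Element-wise a - b for two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return [x - y for x, y in zip(a, b)]

# Toy embeddings of two users' reviews of the same item (illustrative values only).
user_a = [0.5, 0.25, -1.0, 2.0]
user_b = [0.5, 0.75, -1.0, 1.0]

diff = subtract(user_a, user_b)
print(diff)  # [0.0, -0.5, 0.0, 1.0]
```

Dimensions where the two users agree cancel to zero, so only the coordinates on which User A deviates from User B carry signal.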

Key Novelty

Difference-aware Embedding-based Personalization (DEP)

Shifts inter-user comparison from natural language space to latent embedding space, using vector subtraction to capture behavioral deviations
Employs a Sparse Autoencoder (SAE) to distill these difference vectors, filtering out noise and retaining only task-relevant preference signals
Injects these distilled signals as soft prompts into a frozen LLM, aligning the compressed representations with the LLM's input space by training the SAE and projection network while the LLM's own weights stay fixed

Architecture

Overview of the DEP framework, illustrating the flow from history retrieval to soft prompt injection

Breakthrough Assessment

7/10

Proposes a logical shift from text-based to latent-based user comparison, addressing context-window and precision issues. Using an SAE to filter personalization signals is a clever architectural addition.

⚙️ Technical Details

Problem Definition

Setting: Personalized text generation (specifically review generation) given user history and peer behaviors

Inputs: Target user u', target item i', user history D_u', and peer histories

Outputs: Generated text y (e.g., a review) aligned with user preferences

Pipeline Flow

Retrieval (Select representative user history)
Embedding Construction (Encode user review and peer reviews)
Difference Calculation (Compute difference-aware embeddings)
Distillation (SAE compresses embeddings)
Generation (Inject soft prompts into frozen LLM)
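The five steps above can be wired together roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: "most recent n" retrieval and mean pooling over peers are placeholder choices, and `embed`, `distill`, and `project` are hypothetical callables standing in for f_emb, the SAE encoder, and M_p.

```python
from statistics import fmean

def difference_embedding(user_vec, peer_vecs):
    """User embedding minus the mean peer embedding (mean pooling is assumed)."""
    peer_mean = [fmean(col) for col in zip(*peer_vecs)]
    return [u - p for u, p in zip(user_vec, peer_mean)]

def build_soft_prompts(history, peer_reviews, embed, distill, project, n=2):
    """Run steps 1-4 for one user; return one projected vector per usable item.

    history: list of (item_id, review_text), oldest first.
    peer_reviews: dict mapping item_id to peer review texts for that item.
    """
    retrieved = history[-n:]  # 1. Retrieval (placeholder policy: most recent n)
    prompts = []
    for item_id, review in retrieved:
        peers = peer_reviews.get(item_id, [])
        if not peers:
            continue  # no peer overlap on this item, so no difference signal
        user_vec = embed(review)                          # 2. Embedding construction
        peer_vecs = [embed(p) for p in peers]
        diff = difference_embedding(user_vec, peer_vecs)  # 3. Difference calculation
        prompts.append(project(distill(diff)))            # 4. Distillation + projection
    return prompts  # 5. Generation would feed these to the frozen LLM as soft prompts

# Toy stand-ins so the sketch runs end to end.
toy_embed = lambda text: [float(len(text)), float(text.count("!"))]
identity = lambda v: v
history = [("i1", "ok"), ("i2", "great!!"), ("i3", "meh")]
peer_reviews = {"i2": ["fine", "good!"], "i3": []}
prompts = build_soft_prompts(history, peer_reviews, toy_embed, identity, identity)
print(prompts)  # [[2.5, 1.5]]
```

Item i3 is skipped because it has no peer reviews, which mirrors the overlap assumption the method depends on.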

System Modules

History Retriever

Retrieve N key interactions from the user's history to serve as anchors

Model or implementation: Retriever (generic)

Embedding Encoder (Representation Learning)

Encode user reviews and peer reviews into dense vectors

Model or implementation: Frozen text embedding model f_emb

Sparse Autoencoder (SAE) (Representation Learning)

Compress and filter embeddings to retain only task-relevant features via sparsity constraints

Model or implementation: Encoder-Decoder architecture
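A minimal sketch of such an encoder-decoder follows, assuming a sigmoid encoder (so activations lie in (0, 1), which a KL-based sparsity penalty requires) and a linear decoder. The class name `TinySAE`, the layer sizes, and the weights are illustrative and not taken from the paper.

```python
import math

def sigmoid(z):
    """Logistic function; keeps every hidden activation in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Compute W @ v + b with plain Python lists (W is a list of rows)."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

class TinySAE:
    """Toy sparse autoencoder: sigmoid encoder, linear decoder (both assumed)."""

    def __init__(self, W_enc, b_enc, W_dec, b_dec):
        self.W_enc, self.b_enc = W_enc, b_enc
        self.W_dec, self.b_dec = W_dec, b_dec

    def encode(self, x):
        return [sigmoid(z) for z in affine(self.W_enc, self.b_enc, x)]

    def decode(self, h):
        return affine(self.W_dec, self.b_dec, h)

# Made-up weights: 2-d difference embedding -> 3 hidden units -> 2-d reconstruction.
sae = TinySAE(
    W_enc=[[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
    b_enc=[-2.0, -2.0, -2.0],  # negative biases push activations toward zero
    W_dec=[[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]],
    b_dec=[0.0, 0.0],
)
h = sae.encode([0.5, -0.25])  # hidden code that would be passed on to projection
x_hat = sae.decode(h)         # reconstruction, used only for the training loss
```

In training, the sparsity penalty is what keeps most entries of `h` near zero; the negative biases here only imitate that effect for illustration.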

Projection Network (Generation)

Map latent vectors to the LLM's input embedding space

Model or implementation: Linear projection M_p
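As a rough illustration of this step, the sketch below applies a linear map to an SAE latent vector and places the result in front of ordinary token embeddings. The bias term `b_p`, the function names, and all sizes are assumptions made for the example; the summary specifies only a linear projection M_p.

```python
def project(h, M_p, b_p):
    """Linear map from the SAE latent h to one vector in LLM embedding space."""
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(M_p, b_p)]

def prepend_soft_prompts(soft_prompts, token_embeddings):
    """Place soft-prompt vectors in front of the ordinary token embeddings."""
    d = len(token_embeddings[0])
    if any(len(p) != d for p in soft_prompts):
        raise ValueError("soft prompts must match the LLM embedding width")
    return list(soft_prompts) + list(token_embeddings)

# Made-up sizes: 2-d latent, 3-d LLM embeddings, a 2-token textual prompt.
M_p = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_p = [0.0, 0.0, 0.5]  # optional bias (assumption)
soft = project([1.0, 2.0], M_p, b_p)
tokens = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
inputs = prepend_soft_prompts([soft], tokens)
print(soft)         # [1.0, 2.0, 3.5]
print(len(inputs))  # 3
```

Because the projected vector has the same width as a token embedding, the frozen LLM can consume it like any other input position.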

LLM Generator (Generation)

Generate the final personalized review

Model or implementation: Frozen LLM

Novel Architectural Elements

Latent difference modeling module that subtracts peer embeddings from the user embedding (User - Peers) before distillation
Integration of a Sparse Autoencoder (SAE) specifically to distill personalization signals into soft prompts

Modeling

Base Model: Not named in the snippet (referred to only as a generic 'Frozen LLM')

Training Method: Joint training of SAE and Projection Network using Generation Loss + Reconstruction Loss + Sparsity Loss

Objective Functions:

Purpose: Ensure the LLM generates correct text.

Formally: Standard generation loss L_gen based on ground truth text

Purpose: Ensure the SAE retains essential information.

Formally: Smooth L1 reconstruction loss L_recon between input embedding and decoder output

Purpose: Enforce sparsity to filter noise.

Formally: KL divergence L_sparsity between average activation and sparsity target rho

Purpose: Combined objective.

Formally: L = L_gen + lambda * L_recon + gamma * L_sparsity
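The reconstruction and sparsity terms can be sketched in plain Python as below. The Smooth L1 form uses the common threshold parameter `beta`, and the KL term is the standard Bernoulli KL penalty used for sparse autoencoders; the default values of `beta` and `rho`, and the weights passed for lambda and gamma, are placeholders, since the summary does not report them.

```python
import math

def smooth_l1(x, x_hat, beta=1.0):
    """Mean Smooth L1 loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for a, b in zip(x, x_hat):
        d = abs(a - b)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(x)

def kl_sparsity(mean_activations, rho=0.05):
    """Sum over hidden units of KL(Bernoulli(rho) || Bernoulli(rho_hat_j))."""
    eps = 1e-8
    total = 0.0
    for r in mean_activations:
        r = min(max(r, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += rho * math.log(rho / r) + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
    return total

def total_loss(l_gen, l_recon, l_sparsity, lam, gamma):
    """L = L_gen + lambda * L_recon + gamma * L_sparsity."""
    return l_gen + lam * l_recon + gamma * l_sparsity

l_recon = smooth_l1([0.0, 0.0], [0.5, 2.0])
print(l_recon)  # 0.8125
print(total_loss(1.0, l_recon, 0.0, lam=2.0, gamma=3.0))  # 2.625
```

The sparsity term is zero when a unit's average activation equals rho and grows as the unit becomes more active than the target, which is how it suppresses noisy features.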

Training Data:

Amazon Reviews 2023 dataset (Books, Movies & TV, CDs & Vinyl)
Uses most recent interaction for training, random 512 for validation

Compute: Not reported in the paper

Comparison to Prior Work

vs. DPL: DEP models differences in latent space (vectors) rather than natural language space, avoiding verbose prompts and loss of precision
vs. RAG/PAG: DEP explicitly models *inter-user differences* (contrastive) rather than just retrieving the user's own history

Limitations

Relies on the availability of peer users who interacted with the same items (overlap assumption)
Requires training an SAE and projection network (not purely inference-time like standard RAG)
Effectiveness depends on the quality of the frozen text embedding model used for input encoding

Reproducibility

Code: https://github.com/SnowCharmQ/DEP

The dataset used is Amazon Reviews 2023 (preprocessed by DPL). Specific hyperparameters (lambda, gamma) and model sizes are not provided in the snippet.

📊 Experiments & Results

Evaluation Setup

Personalized review generation on the Amazon Reviews 2023 dataset

Benchmarks:

Amazon Reviews 2023 (Books) (Review Generation)
Amazon Reviews 2023 (Movies & TV) (Review Generation)
Amazon Reviews 2023 (CDs & Vinyl) (Review Generation)

Metrics:

Not explicitly reported in the paper
Statistical methodology: Not explicitly reported in the paper

Main Takeaways

Latent space modeling offers a more compact and precise way to represent inter-user differences compared to natural language prompting
The use of Sparse Autoencoders (SAE) effectively filters out task-irrelevant noise from difference embeddings
Soft prompt injection allows personalization of frozen LLMs without heavy fine-tuning of the base model
The method assumes that comparing a user to peers on *shared* items isolates their unique stylistic and preference deviations

📚 Prerequisite Knowledge

Prerequisites

Understanding of Retrieval-Augmented Generation (RAG)
Concept of Soft Prompts/Continuous Prompt Tuning
Vector embeddings and contrastive learning basics
Autoencoder architectures

Key Terms

DEP: Difference-aware Embedding-based Personalization, the proposed framework that models user differences in latent space

SAE: Sparse Autoencoder, a neural network used here to compress embeddings by enforcing sparsity, keeping only the most informative features

Soft Prompt: Learnable vectors injected into the input sequence of an LLM to guide generation without updating the LLM's weights

DPL: Difference-aware Personalization Learning, a baseline method that uses LLMs to generate natural language summaries of differences between users

Latent Space: A compressed vector representation of data where mathematical operations (like subtraction) can represent semantic relationships

KL divergence: A statistical measure used here as a loss function to enforce sparsity targets in the autoencoder