Towards Fair Large Language Model-based Recommender Systems without Costly Retraining

📝 Paper Summary

LLM-based Recommender Systems (LLM-RS), Fairness and Bias Mitigation, Machine Unlearning

FUDLR mitigates fairness issues in LLM-based recommender systems by efficiently identifying bias-inducing training samples via mask learning and removing their influence through a fast, retraining-free machine unlearning update.

Core Problem

LLM-based recommenders inherit biases (like popularity or attribute bias) from training data, but existing debiasing methods lack generality across bias types and require computationally prohibitive retraining.

Why it matters:

LLMs trained on massive, unaligned datasets often perpetuate stereotypes or over-recommend popular items, harming user experience and niche item visibility
Retraining or fine-tuning large models for every specific fairness constraint is operationally infeasible for dynamic, large-scale systems
Existing solutions are often tailored to single bias types, failing to address the diverse or co-existing biases found in real-world applications

Concrete Example: A recommender trained on historical data might over-recommend blockbuster movies (popularity bias) or systematically suggest lower-paying jobs to certain demographic groups (attribute bias). Current methods would fix this by re-running the expensive fine-tuning process with a re-weighted loss, whereas FUDLR updates the existing model directly.

Key Novelty

Fast Unified Debiasing for LLM-RS (FUDLR)

Reformulates debiasing as a machine unlearning task: instead of retraining, it mathematically estimates how the model parameters would change if biased samples were removed
Uses a learnable 'mask' to identify which specific training samples cause bias, optimizing this mask to balance fairness improvement, accuracy preservation, and sparsity
Decouples bias identification from the unlearning mechanism, allowing the system to target different biases (popularity, gender, etc.) simply by swapping the fairness metric used in the mask objective

Architecture

The overall FUDLR framework illustrating the two-stage process: Bias Identification via Mask Learning and Fast Debiasing via Unlearning.

Evaluation Highlights

Mitigates popularity bias while maintaining recommendation accuracy, outperforming retraining-based baselines in the fairness-accuracy trade-off
Achieves comparable or better debiasing performance than full retraining methods but with significantly lower computational cost (orders of magnitude faster)
Demonstrates generality by effectively reducing both item-side popularity bias and user-side attribute bias (e.g., gender discrimination) using the same framework

Breakthrough Assessment

8/10

Offers a highly practical solution to a critical problem (fairness) in a high-cost domain (LLMs) by successfully applying machine unlearning. The unification of different bias types under one framework is a significant methodological advance.

⚙️ Technical Details

Problem Definition

Setting: Sequential recommendation using fine-tuned LLMs, where the goal is to predict the next item in a user's interaction sequence while satisfying fairness constraints

Inputs: User interaction history sequence S_u converted into a natural language prompt z_k

Outputs: Generated next item prediction i_{n+1}
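
As a rough illustration of the input side, the sketch below turns an interaction history S_u into a text prompt z_k. The function name build_prompt, the max_items cutoff, and the template wording are all hypothetical; the paper's actual prompt format is not reproduced here.

```python
from typing import List


def build_prompt(history: List[str], max_items: int = 10) -> str:
    """Turn a user's interaction history into an instruction-style prompt.

    The template wording is an assumption made for illustration only.
    """
    recent = history[-max_items:]  # keep only the most recent interactions
    listed = ", ".join(f'"{title}"' for title in recent)
    return (
        "The user has interacted with the following items in order: "
        f"{listed}. Predict the next item the user will interact with."
    )


prompt = build_prompt(["Toy Story", "Heat", "Casino"])
print(prompt)
```

The fine-tuned LLM would then be asked to generate the title of i_{n+1} as a continuation of this prompt.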

Pipeline Flow

Pre-trained & Fine-tuned LLM (Initial State)
Bias Identification (Mask Learning)
Fast Debiasing (Unlearning Update)

System Modules

Base LLM-RS

Provides the initial biased recommendation model trained on standard data

Model or implementation: BIGRec (instantiated with LLaMA)

Mask Learner

Learns a probability mask 'm' for each training sample to identify which ones contribute most to bias

Model or implementation: Learnable logits optimized via gradient descent

Unlearning Updater

Computes the parameter update Delta_theta to remove the influence of D_unlearn

Model or implementation: Newton-step update using Inverse Hessian approximation
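
To make the Newton-step idea concrete, here is a minimal sketch on a two-parameter ridge regression rather than an LLM. It assumes a summed squared loss, an explicit 2x2 Hessian, and a single outlier as the sample to forget; the helpers fit and unlearn are hypothetical names, and the real method approximates the inverse Hessian instead of forming it.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def fit(data: List[Sample], lam: float = 1e-3) -> Tuple[float, float]:
    """Closed-form ridge fit of y = w*x + b; returns (w, b)."""
    sxx = sum(x * x for x, _ in data) + lam
    sx = sum(x for x, _ in data)
    n = len(data) + lam
    sxy = sum(x * y for x, y in data)
    sy = sum(y for _, y in data)
    det = sxx * n - sx * sx
    return (sxy * n - sx * sy) / det, (sxx * sy - sx * sxy) / det


def unlearn(theta: Tuple[float, float], data: List[Sample],
            forget: List[Sample], lam: float = 1e-3) -> Tuple[float, float]:
    """One Newton-style step: theta + H^-1 * (summed gradient of forgotten samples)."""
    w, b = theta
    # Hessian of the summed loss over the full training set (constant for a quadratic loss)
    h11 = sum(x * x for x, _ in data) + lam
    h12 = sum(x for x, _ in data)
    h22 = len(data) + lam
    # Summed per-sample gradients of the samples to forget, evaluated at theta
    g1 = sum((w * x + b - y) * x for x, y in forget)
    g2 = sum((w * x + b - y) for x, y in forget)
    det = h11 * h22 - h12 * h12
    # Explicit 2x2 inverse applied to the gradient
    d1 = (h22 * g1 - h12 * g2) / det
    d2 = (h11 * g2 - h12 * g1) / det
    return w + d1, b + d2


clean = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outlier = (5.0, 30.0)  # stands in for a bias-inducing sample
data = clean + [outlier]

theta0 = fit(data)                            # model trained on everything
theta_retrain = fit(clean)                    # costly reference: retrain without the outlier
theta_unlearn = unlearn(theta0, data, [outlier])  # cheap one-step estimate
print(theta0, theta_unlearn, theta_retrain)
```

In this toy setting the one-step estimate lands much closer to the retrained parameters than the original model does, without refitting. For an LLM the loss is non-convex, so the same step is only an approximation.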

Novel Architectural Elements

Two-stage pipeline separating bias identification (via mask learning) from model correction (via influence functions)
Bias-agnostic mask optimization that accepts any differentiable fairness metric as a plug-in objective

Modeling

Base Model: LLaMA (instantiated via BIGRec framework)

Training Method: Machine Unlearning via Influence Functions (Post-hoc correction)

Objective Functions:

Purpose: Identify bias-inducing samples.

Formally: Minimize L_mask = -lambda_fair * FairnessImprovement + lambda_acc * AccuracyPreservation + lambda_spa * SparsityRegularization

Purpose: Quantify Fairness Improvement.

Formally: Sum of Influence Scores I(z_k, B(theta)) weighted by the mask m_k

Purpose: Preserve Accuracy.

Formally: Sum of training loss L_LLM(z_k) weighted by the mask m_k

Purpose: Update parameters efficiently.

Formally: Delta_theta approx sum(H_theta^-1 * grad(L_LLM(z_k))) for z_k in D_unlearn
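
The mask objective above can be sketched as follows. This is a toy version under stated assumptions: the influence scores and per-sample losses are made-up numbers treated as precomputed constants, the mask is sigmoid(logit), the sparsity term is taken to be the sum of mask values, and the function names and default weights are hypothetical.

```python
import math
from typing import List


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def mask_loss(logits: List[float], infl: List[float], losses: List[float],
              lam_fair: float, lam_acc: float, lam_spa: float) -> float:
    """L_mask = -lam_fair * sum(m*I) + lam_acc * sum(m*loss) + lam_spa * sum(m)."""
    m = [sigmoid(a) for a in logits]
    fair = sum(mk * ik for mk, ik in zip(m, infl))
    acc = sum(mk * lk for mk, lk in zip(m, losses))
    return -lam_fair * fair + lam_acc * acc + lam_spa * sum(m)


def learn_mask(infl: List[float], losses: List[float], lam_fair: float = 1.0,
               lam_acc: float = 0.5, lam_spa: float = 0.2,
               lr: float = 0.5, steps: int = 500) -> List[float]:
    """Plain gradient descent on the mask logits; returns mask probabilities."""
    logits = [0.0] * len(infl)
    for _ in range(steps):
        for k, a in enumerate(logits):
            m = sigmoid(a)
            # dL/dm_k is constant here; the chain rule adds the sigmoid derivative m*(1-m)
            coeff = -lam_fair * infl[k] + lam_acc * losses[k] + lam_spa
            logits[k] = a - lr * coeff * m * (1.0 - m)
    return [sigmoid(a) for a in logits]


infl = [2.0, 0.1, 1.5, 0.0]    # hypothetical influence of each sample on the bias metric
losses = [0.4, 0.6, 0.3, 0.5]  # hypothetical per-sample training losses
mask = learn_mask(infl, losses)
d_unlearn = [k for k, mk in enumerate(mask) if mk > 0.5]
print(mask, d_unlearn)
```

Samples whose influence on the bias metric outweighs the accuracy and sparsity penalties drift toward a mask value of 1 and form D_unlearn; swapping in a different fairness metric only changes how the influence scores are computed.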

Adaptation: LoRA (Low-Rank Adaptation) applied to the base LLM; unlearning targets LoRA parameters

Trainable Parameters: Mask logits (during identification phase); LoRA adapters (during unlearning update)
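
A quick back-of-the-envelope count shows why restricting unlearning to LoRA parameters keeps the curvature computation manageable. The layer size (4096 x 4096) and rank (8) below are illustrative assumptions, not values reported in the paper.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full weight count, LoRA adapter count) for one linear layer."""
    full = d_out * d_in            # dense weight matrix W
    lora = rank * (d_out + d_in)   # low-rank factors B (d_out x r) and A (r x d_in)
    return full, lora


full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full // lora)
```

Under these assumed sizes the adapter holds 256 times fewer parameters than the dense layer, so any Hessian-related quantity over the adapter is correspondingly smaller.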

Key Hyperparameters:

lambda_fair: Controls weight of fairness improvement in mask objective
lambda_acc: Controls weight of accuracy preservation in mask objective
lambda_spa: Controls sparsity of the learned mask

Compute: Significantly lower than retraining; complexity is dominated by Hessian-vector products, O((n_unlearn + T_cg) * F), rather than by full training, O(n * Epochs * F)
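
The sketch below shows how an inverse-Hessian-vector product can be obtained from Hessian-vector products alone, using conjugate gradient. It reads T_cg as the number of conjugate-gradient iterations, which is an assumption. The toy hvp function multiplies by an explicit 2x2 matrix; in practice the product would come from automatic differentiation without ever building the Hessian.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(hvp: Callable[[Vector], Vector], v: Vector,
                       iters: int = 50, tol: float = 1e-10) -> Vector:
    """Solve H x = v for symmetric positive-definite H, given only p -> H p."""
    x = [0.0] * len(v)
    r = list(v)   # residual v - H x, with x = 0 initially
    p = list(r)   # current search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:
            break
        hp = hvp(p)  # the only place the Hessian is touched
        alpha = rs / dot(p, hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


H = [[4.0, 1.0], [1.0, 3.0]]  # toy stand-in for the loss Hessian


def toy_hvp(p: Vector) -> Vector:
    return [dot(row, p) for row in H]


x = conjugate_gradient(toy_hvp, [1.0, 2.0])
print(x)
```

Each iteration costs one Hessian-vector product, which is why the total cost scales with the iteration count rather than with the size of the training set.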

Comparison to Prior Work

vs. Reweighting: FUDLR avoids costly retraining by using a one-step influence update
vs. Prompt Masking: FUDLR corrects the model parameters directly rather than just altering inputs (which fails for implicit bias)
vs. Counterfactually-Fair-Prompt: FUDLR is a unified framework applicable to multiple bias types (popularity, attribute), whereas CFP is specific to attribute bias and requires adversarial training
vs. PC-Rec [not cited in paper]: PC-Rec adjusts counterfactuals for fairness; FUDLR uses influence functions for efficiency

Limitations

Relies on the Hessian approximation being accurate, which can be challenging for non-convex loss landscapes of LLMs
Effectiveness depends on the definition of a differentiable fairness metric; vague or complex biases might be hard to formulate
Currently validated on LoRA parameters; scaling to full-parameter unlearning would be computationally heavier due to Hessian size
Influence functions are an approximation (first-order Taylor expansion), which may degrade if the parameter change required is large

Reproducibility

Code: https://github.com/JinLi-i/FUDLR

Code and data are available at https://github.com/JinLi-i/FUDLR. The paper provides theoretical proofs for the unlearning update proposition in Appendix A. Specific hyperparameters (lambda values) for the experiments are not explicitly listed in the main text but are implied to be tunable.

📊 Experiments & Results

Evaluation Setup

Sequential recommendation on real-world datasets, evaluating both item-side (popularity) and user-side (attribute) fairness

Benchmarks:

MovieLens-1M (Sequential Recommendation)
Insurance (Sequential Recommendation)

Metrics:

NDCG@10 (Accuracy)
Hit@10 (Accuracy)
DP (Demographic Parity - Fairness)
EO (Equality of Opportunity - Fairness)
PopBias (Popularity Bias Metric)

Statistical methodology: Not explicitly reported in the paper
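
For readers unfamiliar with the metrics listed above, here is a minimal sketch of Hit@k, NDCG@k for a single ground-truth next item, and a demographic parity gap. These are common textbook definitions; the paper's exact formulations of DP, EO, and PopBias may differ, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def hit_at_k(ranked: Sequence[str], target: str, k: int = 10) -> float:
    """1.0 if the true next item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked: Sequence[str], target: str, k: int = 10) -> float:
    """NDCG with one relevant item: 1 / log2(rank + 1), with rank starting at 1."""
    top = list(ranked[:k])
    if target not in top:
        return 0.0
    return 1.0 / math.log2(top.index(target) + 2)


def demographic_parity_gap(outcomes: Sequence[int], groups: Sequence[str]) -> float:
    """Largest difference in positive-outcome rate between any two groups."""
    rates: List[float] = []
    for g in set(groups):
        vals = [o for o, gg in zip(outcomes, groups) if gg == g]
        rates.append(sum(vals) / len(vals))
    return max(rates) - min(rates)


ranked = ["a", "b", "c"]
print(hit_at_k(ranked, "b", k=2), ndcg_at_k(ranked, "b"))
print(demographic_parity_gap([1, 1, 0, 0], ["A", "A", "B", "B"]))
```

A gap of 0 means every group receives the positive outcome at the same rate; larger values indicate a stronger dependence on group membership.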

Key Results

Experimental results demonstrate FUDLR's ability to improve fairness metrics significantly with minimal impact on accuracy compared to baselines.

Benchmark	Metric	Baseline	This Paper	Δ
MovieLens-1M	PopBias (lower is better)	Not explicitly reported in text, inferred from improvement	See table in paper	Paper claims 'effective mitigation'; exact numeric delta not extractable from text snippet

Experiment Figures

Illustration of Popularity Bias and Attribute Bias in LLM-RS.

Main Takeaways

FUDLR effectively reduces popularity bias (item-side) and attribute bias (user-side) without retraining.
The method achieves a superior Pareto frontier between fairness and accuracy compared to reweighting and other baselines.
Efficiency analysis confirms that the unlearning update is orders of magnitude faster than retraining, making it practical for LLMs.
The bias-agnostic mask learning successfully identifies specific samples responsible for different types of bias (e.g., popular items vs. demographic imbalances).

📚 Prerequisite Knowledge

Prerequisites

Generative Recommender Systems (LLMs predicting next items)
Machine Unlearning (influence functions)
Hessian-vector products (for efficient parameter estimation)
Parameter-Efficient Fine-Tuning (specifically LoRA)

Key Terms

LLM-RS: Large Language Model-based Recommender Systems—recommenders that use LLMs to generate item predictions from textual history

Machine Unlearning: The process of removing the influence of specific training data points from a trained model without retraining from scratch

Influence Function: A technique from robust statistics that estimates how model parameters would change if a specific training point were up-weighted or removed

LoRA: Low-Rank Adaptation—a technique to fine-tune LLMs by updating only a small set of low-rank matrices while freezing the main weights

Hessian Matrix: A matrix of second-order partial derivatives of the loss function, used here to capture the curvature of the loss landscape for accurate parameter updates

HVP: Hessian-Vector Product—an efficient way to compute the product of the Hessian matrix and a vector without explicitly constructing the massive Hessian matrix

Popularity Bias: The tendency of recommenders to suggest popular items much more frequently than niche items

Attribute Bias: Discrimination in recommendations based on sensitive user attributes like gender or race, often measured with metrics such as demographic parity

Demographic Parity: A fairness metric ensuring that the probability of a positive outcome (recommendation) is independent of sensitive group membership