
UFO: Unfair-to-Fair Evolving Mitigates Unfairness in LLM-based Recommender Systems via Self-Play Fine-tuning

Jiaming Zhang, Yuyuan Li, Xiaohua Feng, Zhifei Ren, Li Zhang, Chaochao Chen
Zhejiang University, Hangzhou Dianzi University, Southeast University
arXiv (2025)
Recommendation RL Pretraining

📝 Paper Summary

LLM-based Recommender Systems (LRS) Item-side Fairness
UFO mitigates item-side unfairness in LLM recommenders by analyzing how supervised fine-tuning amplifies pre-training bias and correcting it via a self-play game between a judger and a corrector.
Core Problem
LLM-based Recommender Systems (LRSs) exhibit severe item-side unfairness because the Supervised Fine-Tuning (SFT) stage reinforces and amplifies inherent biases from the pre-training stage.
Why it matters:
  • Current methods like re-weighting or re-ranking only address bias during SFT, ignoring the root cause in pre-training
  • LRSs exhibit more severe unfairness than traditional models (e.g., SASRec), leading to significant inequality in item exposure for specific groups (e.g., job providers)
  • Existing fairness constraints often degrade the recommendation performance (utility) of the system
Concrete Example: In an empirical study on ML-1M using Llama-2-7b, the covariance between pre-training bias and SFT bias shift was positive (7.73e-4), meaning the fine-tuning process actively reinforced the model's initial preference for dominant genres rather than correcting it.
Key Novelty
Unfair-to-Fair evOlving (UFO) with Self-Play
  • Frames fairness alignment as a two-player game: a 'Judger' identifies unfair outputs relative to ideal distributions, and a 'Corrector' adjusts the model to fool the Judger
  • Identifies that SFT amplifies pre-training bias (positive covariance) rather than just introducing new bias, requiring corrections that address both stages
  • Uses a geometric mixture policy to interpolate between the current and original model, ensuring fairness improvements do not catastrophically degrade recommendation utility
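A geometric mixture of two policies can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `geometric_mixture`, the weight `alpha`, and the convention that `alpha` weights the current policy are all assumptions. The sketch treats each policy as a probability distribution over a small candidate set and forms a normalized product of powers, which is a weighted average in log space.

```python
import math


def geometric_mixture(p_current, p_reference, alpha):
    """Normalized geometric mixture: p_current**alpha * p_reference**(1 - alpha).

    Hypothetical helper for illustration. Assumes both inputs are
    distributions over the same candidates with strictly positive entries.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(p_current) != len(p_reference):
        raise ValueError("distributions must have the same length")
    if any(p <= 0.0 for p in list(p_current) + list(p_reference)):
        raise ValueError("probabilities must be strictly positive")

    # Interpolate in log space, then exponentiate with max-subtraction
    # for numerical stability.
    logs = [
        alpha * math.log(pc) + (1.0 - alpha) * math.log(pr)
        for pc, pr in zip(p_current, p_reference)
    ]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Toy distributions over two items (illustrative values only).
mixed = geometric_mixture([0.8, 0.2], [0.2, 0.8], alpha=0.5)
```

Under this convention, `alpha = 0` recovers the original (reference) policy and `alpha = 1` recovers the current one, so intermediate values limit how far the updated model can drift from the original.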
Evaluation Highlights
  • Analysis reveals positive covariance (7.73e-4) between pre-training bias and SFT bias shift on ML-1M, indicating that SFT amplifies existing inequities
  • 7 out of 10 genre groups in Llama-2-7b retained the same bias direction after SFT, consistent with the reinforcement hypothesis
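The two diagnostics above (the covariance between pre-training bias and SFT bias shift, and the count of groups keeping the same bias direction) can be sketched as below. The summary does not give the paper's exact definitions, so this sketch assumes a per-group signed bias (for example, a group's share of recommendations minus its target share), defines the shift as post-SFT bias minus pre-training bias, and uses population covariance. The helper names and the four-group values are hypothetical toy numbers, not results from the paper.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length sequences (assumed estimator)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def same_direction_count(bias_before, bias_after):
    """Count groups whose bias has the same sign before and after fine-tuning."""
    return sum(1 for b, a in zip(bias_before, bias_after) if b * a > 0)


# Toy per-group signed biases (illustrative only, four hypothetical groups).
pretrain_bias = [0.04, -0.02, 0.01, -0.03]
sft_bias = [0.07, -0.05, -0.01, -0.01]

# Bias shift introduced by SFT, per group.
shift = [after - before for before, after in zip(pretrain_bias, sft_bias)]

cov = covariance(pretrain_bias, shift)
kept = same_direction_count(pretrain_bias, sft_bias)
```

A positive `cov` means groups that were already over-exposed before fine-tuning tend to gain further exposure, which is the amplification pattern the summary describes; `kept` is the analogue of the "7 out of 10" count.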
Breakthrough Assessment
7/10
A strong analytical contribution that identifies 'bias amplification' in LRSs. The self-play solution is conceptually novel for this domain. The score is limited by the lack of visible end-to-end performance metrics in the provided text.