
Exact and Efficient Unlearning for Large Language Model-based Recommendation

Zhiyu Hu, Yang Zhang, Minghao Xiao, Wenjie Wang, Fuli Feng, Xiangnan He
University of Science and Technology of China, National University of Singapore
arXiv (2024)
Recommendation P13N

📝 Paper Summary

LLM-based Recommendation (LLMRec), Machine Unlearning, Privacy in LLMs
APA partitions recommendation data to train separate LoRA adapters and aggregates them at inference time using a sample-adaptive strategy, allowing exact unlearning by retraining only affected adapters.
Core Problem
Existing unlearning methods for LLMs are either computationally expensive (full retraining) or approximate (incomplete data erasure), making them unsuitable for LLM-based recommendation systems where exact removal of user behavior data is required.
Why it matters:
  • LLMs fine-tuned on recommendation data risk leaking sensitive user history, violating privacy regulations like GDPR
  • Retraining the entire LLM for every deletion request is computationally prohibitive due to billions of parameters
  • Approximate unlearning methods do not guarantee complete removal of unusable data (data the system is no longer permitted to use), which strict privacy compliance requires
Concrete Example: If a user revokes consent for their click history on 'Inception', a standard TALLRec model retains this knowledge in its fine-tuned weights. To remove it, one must typically retrain the whole model or use approximate methods that might still leak the preference.
Key Novelty
Adapter Partition and Aggregation (APA)
  • Partition training data into balanced shards based on semantic clusters and train a separate LoRA adapter for each shard
  • Achieve exact unlearning by retraining only the specific sub-adapter containing the deleted data, drastically reducing computational cost
  • During inference, aggregate weights from all sub-adapters into a single adapter using a sample-adaptive attention mechanism based on validation performance
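The three steps above can be illustrated with a small, self-contained sketch. It is not the authors' implementation. Round-robin assignment stands in for the semantic-cluster partitioning, a mean feature vector stands in for a trained LoRA adapter, and the per-sample scores passed to `aggregate` are hypothetical placeholders for whatever validation-based signal drives the attention. All function names are illustrative assumptions.

```python
import math


def partition_balanced(sample_ids, num_shards):
    """Split samples into equally sized shards.

    Assumption: round-robin stands in for the balanced semantic clustering
    described in the summary.
    """
    shards = [[] for _ in range(num_shards)]
    for i, sample_id in enumerate(sample_ids):
        shards[i % num_shards].append(sample_id)
    return shards


def train_adapter(shard, features):
    """Toy stand-in for LoRA fine-tuning: the 'adapter' is the shard's mean feature vector."""
    dim = len(next(iter(features.values())))
    if not shard:
        return [0.0] * dim
    return [sum(features[s][d] for s in shard) / len(shard) for d in range(dim)]


def unlearn(shards, adapters, features, sample_id):
    """Drop one sample and retrain only the adapter whose shard contained it."""
    for k, shard in enumerate(shards):
        if sample_id in shard:
            shard.remove(sample_id)
            adapters[k] = train_adapter(shard, features)
            return k  # index of the single retrained adapter
    raise KeyError(sample_id)


def aggregate(adapters, scores):
    """Merge adapter weights with softmax attention over per-sample scores.

    Assumption: `scores` holds one hypothetical quality score per adapter
    for the current test sample.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(adapters[0])
    merged = [sum(w * a[d] for w, a in zip(weights, adapters)) for d in range(dim)]
    return merged, weights


# Usage: six toy samples, three shards, one deletion request.
features = {i: [float(i), 1.0] for i in range(6)}
shards = partition_balanced(list(features), num_shards=3)
adapters = [train_adapter(shard, features) for shard in shards]
retrained = unlearn(shards, adapters, features, sample_id=4)
merged, weights = aggregate(adapters, scores=[0.2, 0.9, 0.1])
print("retrained adapter:", retrained, "attention weights:", weights)
```

Because aggregation happens at the weight level, inference runs with a single merged adapter instead of K separate forward passes; only the attention weights change from one test sample to the next.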
Evaluation Highlights
  • Maintains recommendation performance comparable to a standard (non-partitioned) TALLRec model across two real-world datasets
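  • Guarantees exact unlearning by construction rather than by measurement: the affected adapter is retrained without the deleted data, so the data cannot influence the resulting weights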
  • Significantly reduces unlearning cost compared to full retraining; the saving grows with the number of shards K, because a deletion request retrains only the adapter whose shard contained the deleted data
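The cost claim can be made concrete with a rough estimate. The sketch below assumes that training cost is linear in the number of samples processed and that each affected shard is retrained in full on its remaining data. Neither assumption is taken from the paper, and `retraining_cost_ratio` is a hypothetical helper.

```python
def retraining_cost_ratio(shard_of, deletions):
    """Fraction of the remaining data that must be re-processed after deletions.

    shard_of:  dict mapping sample id -> shard index
    deletions: iterable of sample ids to remove
    Assumption: cost is proportional to the number of samples retrained on.
    """
    removed = set(deletions)
    remaining_per_shard = {}
    for sample_id, shard in shard_of.items():
        if sample_id not in removed:
            remaining_per_shard[shard] = remaining_per_shard.get(shard, 0) + 1

    affected = {shard_of[sample_id] for sample_id in removed}
    total_remaining = sum(remaining_per_shard.values())
    retrained = sum(remaining_per_shard.get(shard, 0) for shard in affected)
    return retrained / total_remaining if total_remaining else 0.0


# Usage: 100 samples spread evenly over K = 4 shards.
shard_of = {i: i % 4 for i in range(100)}
print(retraining_cost_ratio(shard_of, [0]))     # one shard affected, roughly 1/K
print(retraining_cost_ratio(shard_of, [0, 1]))  # two shards affected, roughly 2/K
```

Under these assumptions a single deletion costs about 1/K of a full retraining run, while a batch of deletions spread over many shards erodes the saving, so the benefit depends on how deletion requests are distributed across shards.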
Breakthrough Assessment
7/10
First framework to address exact unlearning specifically for LLMRec. Effectively balances the trade-off between unlearning efficiency and recommendation performance using a novel aggregation strategy.