
FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning

Rishub Tamirisa, Chulin Xie, Wenxuan Bao, Andy Zhou, Ron Arel, Aviv Shamsian
Lapis Labs, University of Illinois Urbana-Champaign, Bar-Ilan University
Computer Vision and Pattern Recognition (2024)
P13N Benchmark

📝 Paper Summary

Personalized Federated Learning (PFL), Parameter-Efficient Fine-Tuning
FedSelect progressively identifies which parameters to personalize during federated training, using the magnitude of each parameter's local update: parameters that change the most stay local to each client, while stable parameters are aggregated globally.
Core Problem
Standard Federated Learning struggles with heterogeneous client data, and existing personalization methods typically personalize coarse, pre-defined layers (such as classifier heads) rather than identifying the specific parameters that need adaptation.
Why it matters:
  • Pre-selecting specific layers (e.g., only classifier heads) limits the model's ability to adapt to complex local distributions
  • Parameter importance varies significantly even within the same layer, meaning layer-wise decoupling is suboptimal for balancing global knowledge and local personalization
  • Fixed architectures for personalization fail to account for the unique data distribution needs of individual clients
Concrete Example: In a setup where clients have different label distributions (e.g., CIFAR-10 split by class), standard methods like FedRep force the feature extractor to be global and the head to be local. FedSelect might find that specific neurons *within* the feature extractor are critical for a specific client's unique classes, personalizing those while sharing the rest.
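The contrast in this example can be made concrete with a small sketch. It is illustrative only: the layer names, sizes, and chosen indices are hypothetical, and flat boolean lists stand in for real tensor masks. It shows a FedRep-style layer-wise split next to a parameter-wise split that also keeps a few feature-extractor entries local.

```python
# Hypothetical toy model: layer names and sizes are illustrative only.
model_shapes = {"extractor.conv1": 6, "extractor.conv2": 6, "head.fc": 4}


def layerwise_mask(shapes, local_layers):
    """Layer-wise split: a whole layer is either local (True) or global (False)."""
    return {name: [name in local_layers] * size for name, size in shapes.items()}


def parameterwise_mask(shapes, local_indices):
    """Parameter-wise split: any individual entry of any layer may be local."""
    return {
        name: [i in local_indices.get(name, set()) for i in range(size)]
        for name, size in shapes.items()
    }


def count_local(mask):
    """Number of personalized (local) entries per layer."""
    return {name: sum(bits) for name, bits in mask.items()}


# FedRep-style: only the head is personalized.
fedrep_like = layerwise_mask(model_shapes, local_layers={"head.fc"})

# Parameter-wise: the head plus two hand-picked extractor entries
# (in FedSelect these would be discovered during training, not hand-picked).
fine_grained = parameterwise_mask(
    model_shapes,
    local_indices={"extractor.conv2": {1, 4}, "head.fc": {0, 1, 2, 3}},
)

print(count_local(fedrep_like))   # {'extractor.conv1': 0, 'extractor.conv2': 0, 'head.fc': 4}
print(count_local(fine_grained))  # {'extractor.conv1': 0, 'extractor.conv2': 2, 'head.fc': 4}
```

The layer-wise mask cannot express the second configuration at all, which is the limitation the example above describes.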
Key Novelty
Iterative Gradient-Based Subnetwork Personalization
  • Hypothesizes that parameters changing the most during local updates are critical for personalization, while stable parameters represent shared global knowledge
  • Iteratively expands a client-specific mask (subnetwork) of personalized parameters based on update magnitude, in the spirit of the iterative search used for the Lottery Ticket Hypothesis, but selecting parameters to personalize rather than to prune
  • Maintains a 'parameter-wise' rather than 'layer-wise' split, allowing arbitrary subnetworks to be kept local while the rest are aggregated
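The three points above can be sketched as a single communication round. This is a minimal sketch under stated assumptions, not the authors' implementation: parameters are flat Python lists, `grow_frac` is a hypothetical knob for how many still-global entries become personalized per round, and averaging each coordinate over the clients that still share it is an assumed aggregation rule.

```python
def grow_mask(mask, before, after, grow_frac):
    """Personalize the still-global entries whose local update |after - before| is largest.

    grow_frac (hypothetical) is the fraction of still-global entries to personalize per call.
    """
    global_idx = [i for i, is_local in enumerate(mask) if not is_local]
    k = int(len(global_idx) * grow_frac)
    ranked = sorted(global_idx, key=lambda i: abs(after[i] - before[i]), reverse=True)
    chosen = set(ranked[:k])
    return [is_local or (i in chosen) for i, is_local in enumerate(mask)]


def aggregate_global(client_params, client_masks, server_params):
    """Average each coordinate over the clients that still treat it as global (assumed rule)."""
    new_server = list(server_params)
    for i in range(len(server_params)):
        shared = [p[i] for p, m in zip(client_params, client_masks) if not m[i]]
        if shared:  # keep the old server value if every client personalized this entry
            new_server[i] = sum(shared) / len(shared)
    return new_server


def merge_for_client(local_params, mask, server_params):
    """Next-round start: local values where personalized, server values elsewhere."""
    return [l if is_local else s for l, is_local, s in zip(local_params, mask, server_params)]


server = [0.0, 0.0, 0.0, 0.0]
# Parameters after local training on each client (made-up numbers).
client_a = [0.25, 1.0, -0.5, 0.0]
client_b = [0.75, 0.5, 0.5, -2.0]

no_mask = [False] * 4
mask_a = grow_mask(no_mask, server, client_a, grow_frac=0.25)  # index 1 changed most
mask_b = grow_mask(no_mask, server, client_b, grow_frac=0.25)  # index 3 changed most

new_server = aggregate_global([client_a, client_b], [mask_a, mask_b], server)
print(mask_a, mask_b)
print(new_server)                                      # [0.5, 0.5, 0.0, 0.0]
print(merge_for_client(client_a, mask_a, new_server))  # [0.5, 1.0, 0.0, 0.0]
print(merge_for_client(client_b, mask_b, new_server))  # [0.5, 0.5, 0.0, -2.0]
```

Calling `grow_mask` again in later rounds only ever adds entries to a client's mask, which mirrors the progressive expansion described above; each client ends up with its own arbitrary subnetwork of local parameters.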
Evaluation Highlights
  • Outperforms state-of-the-art PFL baselines (FedRep, FedPAC) by significant margins on CIFAR-10 and CIFAR-10-C under heterogeneous settings
  • Achieves superior personalization accuracy on OfficeHome and Mini-ImageNet benchmarks compared to layer-wise decoupling methods
  • Demonstrates robustness to both label distribution shifts and feature distribution shifts (e.g., corruptions in CIFAR-10-C)
Breakthrough Assessment
7/10
Offers a clever, granular approach to parameter decoupling that moves beyond rigid layer-based heuristics. The connection to the Lottery Ticket Hypothesis (selecting parameters to keep local rather than to prune) is intuitive and effective.