
FedL2P: Federated Learning to Personalize

Royson Lee, Minyoung Kim, Da Li, Xinchi Qiu, Timothy Hospedales, Ferenc Huszár, Nicholas D. Lane
University of Cambridge, UK, Samsung AI Center, Cambridge, UK, University of Edinburgh, UK, Flower Labs
arXiv (2023)
P13N Speech Benchmark

📝 Paper Summary

Federated Learning, Personalized Federated Learning
FedL2P uses federated meta-learning to train auxiliary networks that map a client's local data statistics to optimal fine-tuning hyperparameters (learning rates and batch norm weights) for personalizing a global model.
Core Problem
Clients in federated learning exhibit varying types of heterogeneity (label shift vs. feature shift), making one-size-fits-all personalization strategies (like freezing specific layers or enforcing specific batch norm usage) suboptimal.
Why it matters:
  • Manual personalization heuristics (e.g., 'always use local Batch Norm') fail when clients differ in how closely their local data matches what the global model has learned
  • Existing HPO methods in FL often learn a single set of hyperparameters for all clients or fail to account for client-specific data distributions during the parameter search
  • New clients joining the network usually require computationally expensive local search to find optimal hyperparameters
Concrete Example: In a setup with both feature and label shift, Client A might benefit from using its own Batch Norm statistics (due to feature shift), while Client B might benefit from the global model's statistics (due to small data size). A standard strategy forces both to use the same setting, hurting at least one client.
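The trade-off in this example can be made concrete with a minimal sketch. It assumes the batch norm hyperparameter is a scalar weight `beta` in [0, 1] that linearly interpolates between the global model's running statistics and the client's own statistics. The function name `mix_bn_stats` and the interpolation form are illustrative assumptions, not details confirmed by this summary.

```python
def mix_bn_stats(global_stats, client_stats, beta):
    """Interpolate per-channel batch norm statistics.

    global_stats / client_stats: (mean_list, var_list) tuples.
    beta: hypothetical mixing weight; 0 keeps the global statistics,
          1 uses the client's statistics only.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    g_mean, g_var = global_stats
    c_mean, c_var = client_stats
    mean = [(1 - beta) * g + beta * c for g, c in zip(g_mean, c_mean)]
    var = [(1 - beta) * g + beta * c for g, c in zip(g_var, c_var)]
    return mean, var


# Toy two-channel statistics (made-up numbers for illustration).
global_stats = ([0.0, 1.0], [1.0, 1.0])
client_stats = ([2.0, 3.0], [4.0, 0.5])

# Client A (strong feature shift) leans on its own statistics.
client_a = mix_bn_stats(global_stats, client_stats, beta=1.0)
# Client B (little data) stays with the global statistics.
client_b = mix_bn_stats(global_stats, client_stats, beta=0.0)
```

A fixed strategy corresponds to hard-coding one value of `beta` for every client; a per-client value lets Client A and Client B end up at different points on this line.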
Key Novelty
Federated Meta-Learning of Hyperparameter Networks
  • Instead of learning the hyperparameters directly, learn 'meta-nets' (small MLPs) that function as a policy: they take client data statistics as input and output the optimal hyperparameters
  • Decouples the strategy from the client identity: the meta-nets learn to recognize data patterns (e.g., high feature variance) and prescribe the correct adaptation strategy (e.g., high learning rate for BN layers)
  • Enables 'zero-shot' personalization configuration: new clients can generate optimal hyperparameters instantly by passing their data statistics through the pre-trained meta-nets without iterative search
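The idea in the bullets above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the two-layer MLP shape, the sigmoid output squashing, the `base_lr` scaling, and all names (`make_mlp`, `predict_hyperparameters`, `lr_net`, `bn_net`) are hypothetical, and the weights are randomly initialised stand-ins for meta-nets that would be trained through federated meta-learning.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Random two-layer MLP parameters (stand-in for a trained meta-net)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    b2 = [0.0] * out_dim
    return w1, b1, w2, b2


def mlp_forward(params, x):
    """Linear -> ReLU -> Linear forward pass."""
    w1, b1, w2, b2 = params
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_hyperparameters(lr_net, bn_net, client_stats, base_lr=0.01):
    """Map client data statistics to per-layer hyperparameters.

    Assumption: learning rates are base_lr scaled into (0, 2 * base_lr),
    and batch norm mixing weights are squashed into (0, 1).
    """
    lrs = [base_lr * 2.0 * sigmoid(z) for z in mlp_forward(lr_net, client_stats)]
    betas = [sigmoid(z) for z in mlp_forward(bn_net, client_stats)]
    return lrs, betas


num_layers = 3
lr_net = make_mlp(in_dim=4, hidden_dim=8, out_dim=num_layers, seed=0)
bn_net = make_mlp(in_dim=4, hidden_dim=8, out_dim=num_layers, seed=1)

# A new client only needs one forward pass over its (toy) data statistics.
new_client_stats = [0.3, 1.2, -0.1, 0.8]
lrs, betas = predict_hyperparameters(lr_net, bn_net, new_client_stats)
```

Because the meta-nets take statistics rather than a client identifier as input, the same call works for a client that was never seen during training, which is what makes the search-free configuration possible.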
Architecture
Figure 1: The workflow of FedL2P, showing the interaction between the Global Model, the Personalization Strategy (Meta-nets), and the Client Data.
Evaluation Highlights
  • +25.09 percentage points of accuracy on Speech Commands (Unseen Clients) over standard fine-tuning with client Batch Norm statistics (87.85% vs 62.76%)
  • Outperforms the FedBABU and PerFedAvg baselines on CIFAR-10 with high heterogeneity (alpha=0.1): 80.28% vs 79.58% (FedBABU) and 77.68% (PerFedAvg)
  • Achieves 88.85% on Office-Caltech-10 compared to 80.97% for standard fine-tuning (+7.88 percentage points), effectively handling feature distribution shifts
Breakthrough Assessment
7/10
Strong methodological contribution by applying meta-learning to HPO in FL. The ability to generalize to unseen clients is a significant practical advantage. Results are solid, though improvements on some benchmarks (CIFAR) are marginal compared to the massive gains on others (Speech Commands).