
Adaptive Test-Time Personalization for Federated Learning

Wenxuan Bao, Tianxin Wei, Haohan Wang, Jingrui He
University of Illinois Urbana-Champaign
Neural Information Processing Systems (2023)
P13N Benchmark

📝 Paper Summary

Federated Learning (FL), Test-Time Adaptation (TTA)
ATP improves federated learning generalization to unseen clients by learning module-specific adaptation rates from source clients, enabling targeted test-time adaptation without labeled data.
Core Problem
Standard federated learning (FL) struggles to generalize to new clients whose data distributions differ from those seen during training. Existing Test-Time Adaptation (TTA) methods are brittle because they fix in advance which modules to adapt, so they fail when the type of distribution shift varies (e.g., feature shift vs. label shift).
Why it matters:
  • Real-world FL clients (e.g., mobile users) often lack labeled data for personalization, rendering supervised personalization methods unusable
  • Existing TTA methods trade off performance across shift types: adapting Batch Norm helps under feature shift but hurts under label shift, while adapting the classifier does the reverse
  • Overlooking the interrelationships among multiple source domains leads to suboptimal generalization in standard TTA approaches
Concrete Example: Under label shift, adapting Batch Norm (BN) layers (standard in methods like Tent) degrades accuracy because aligning feature distributions harms class separability. Conversely, adapting the linear head helps under label shift but fails under feature corruption. ATP automatically learns to use negative adaptation rates for BN under label shift and positive rates under feature shift.
Key Novelty
Adaptive Test-time Personalization (ATP)
  • Treats the 'adaptation rate' (learning rate) of every module in the network as a learnable parameter that is meta-learned during training on source clients
  • During training, source clients simulate unsupervised test-time adaptation and then use their labels to refine these adaptation rates, minimizing the post-adaptation loss
  • Introduces a cumulative moving average mechanism for online test-time adaptation to solve the batch dependency problem, where early batches suffer from weaker models
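The first two points above can be illustrated with a toy sketch. This is not the authors' implementation: the names `adapt`, `rate_gradients`, and `meta_step` are hypothetical, each "module" is a short list of floats, the unsupervised update directions are given as fixed numbers, and a squared distance to hand-picked `targets` stands in for the labeled loss a source client would compute. The sketch only shows how a per-module rate can be updated with the chain rule, and how a rate can turn negative when the unsupervised direction hurts the labeled loss.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def adapt(params: Params, directions: Params, rates: Dict[str, float]) -> Params:
    """One unsupervised step: each module moves along its own update
    direction, scaled by that module's adaptation rate (may be negative)."""
    return {
        name: [p + rates[name] * d for p, d in zip(values, directions[name])]
        for name, values in params.items()
    }


def rate_gradients(loss_grads: Params, directions: Params) -> Dict[str, float]:
    """Chain rule for adapted = params + rate * direction: the gradient of the
    loss w.r.t. a module's rate is the inner product of the loss gradient at
    the adapted parameters with that module's update direction."""
    return {
        name: sum(g * d for g, d in zip(loss_grads[name], directions[name]))
        for name in directions
    }


def meta_step(params: Params, directions: Params, targets: Params,
              rates: Dict[str, float], meta_lr: float) -> Dict[str, float]:
    """Simulate adaptation, score it with a labeled loss, refine the rates."""
    adapted = adapt(params, directions, rates)
    # ASSUMPTION: toy labeled loss 0.5 * ||adapted - target||^2, whose
    # gradient is (adapted - target). A real client would use its labels.
    loss_grads = {
        name: [a - t for a, t in zip(adapted[name], targets[name])]
        for name in adapted
    }
    grads = rate_gradients(loss_grads, directions)
    return {name: rates[name] - meta_lr * grads[name] for name in rates}


# Toy setup: the unsupervised direction helps "head" but hurts "bn".
params = {"bn": [1.0], "head": [1.0]}
directions = {"bn": [1.0], "head": [1.0]}
targets = {"bn": [0.5], "head": [1.5]}
rates = {"bn": 0.0, "head": 0.0}

for _ in range(200):
    rates = meta_step(params, directions, targets, rates, meta_lr=0.1)

print(rates)  # "bn" settles at a negative rate, "head" at a positive one
```

In this toy setting the rate for `bn` converges toward -0.5 and the rate for `head` toward +0.5, mirroring the idea that the sign and size of each module's rate are learned rather than fixed in advance.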
Architecture
Figure: Algorithms 1 & 2. The meta-learning training process on source clients and the inference adaptation process on target clients.
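For the online inference side, the summary mentions a cumulative moving average. A minimal sketch of that idea follows, under the assumption that the average is taken over per-batch update directions and then scaled by a module's learned rate. The class `CumulativeAverage`, the function `adapt_module`, and all numbers are hypothetical illustrations, not taken from the paper.

```python
from typing import List


class CumulativeAverage:
    """Running mean over all vectors seen so far:
    mean_t = mean_{t-1} + (x_t - mean_{t-1}) / t."""

    def __init__(self, size: int) -> None:
        self.count = 0
        self.mean = [0.0] * size

    def update(self, values: List[float]) -> List[float]:
        self.count += 1
        self.mean = [m + (v - m) / self.count for m, v in zip(self.mean, values)]
        return list(self.mean)


def adapt_module(params: List[float], direction: List[float], rate: float) -> List[float]:
    """Shift a module's parameters along a direction, scaled by its rate."""
    return [p + rate * d for p, d in zip(params, direction)]


# ASSUMPTION: each incoming unlabeled batch yields one update direction for
# this module; the values below are made up for illustration.
tracker = CumulativeAverage(size=2)
base_params = [1.0, 2.0]
rate = 0.5
batch_directions = [[2.0, 0.0], [4.0, 2.0], [0.0, 4.0]]

adapted_per_batch = [
    adapt_module(base_params, tracker.update(direction), rate)
    for direction in batch_directions
]
print(adapted_per_batch)
```

Because every batch contributes equally to the running mean, later batches are adapted with a direction estimated from all data seen so far, while the base parameters themselves stay fixed.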
Evaluation Highlights
  • +9.37% accuracy improvement over best baseline (MEMO) on CIFAR-10 under complex 'hybrid shift' (simultaneous feature corruption and label shift)
  • Ranks first (Rank 1.0) consistently across feature, label, and hybrid shifts, whereas baselines like SHOT and Tent fluctuate drastically (e.g., Rank 9.3 and 8.3)
  • Outperforms state-of-the-art domain generalization methods on Digits-5 (+4.1% vs Surgical Fine-Tuning on SVHN domain) and PACS benchmarks
Breakthrough Assessment
8/10
Significantly advances FL generalization by solving the 'what to adapt' problem in TTA. The method is simple, theoretically grounded, and empirically dominant across diverse shifts where baselines fail.