Evaluation Setup
Non-IID federated learning simulations covering covariate shift (clients drawn from different domains) and label shift (class imbalance across clients).
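As an illustration of what a label-shift simulation can look like, the sketch below splits a labelled dataset across clients with skewed class proportions. It assumes a Dirichlet-based partition, which is a common choice in the federated learning literature but is not confirmed by this summary as the scheme used here. The function name `dirichlet_label_split`, the concentration parameter `alpha`, and the toy labels are all hypothetical.

```python
import random

def dirichlet_label_split(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients with Dirichlet-skewed class proportions.

    Assumption: Dirichlet partitioning; the source only says "label shift".
    Smaller alpha gives more imbalanced clients.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y, idxs in sorted(by_class.items()):
        rng.shuffle(idxs)
        # Draw Dirichlet(alpha) weights by normalizing Gamma(alpha, 1) samples.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        # Turn cumulative proportions into cut points over this class's samples.
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += gammas[c] / total
            end = len(idxs) if c == num_clients - 1 else int(round(cum * len(idxs)))
            clients[c].extend(idxs[start:end])
            start = end
    return clients

# Toy data: 1000 samples over 10 balanced classes (placeholder, not a real benchmark).
toy_labels = [i % 10 for i in range(1000)]
client_indices = dirichlet_label_split(toy_labels, num_clients=5, alpha=0.5)
```

Every sample lands on exactly one client, while the per-client class mix differs from the global one. Covariate shift would instead be simulated by giving each client data from a different domain.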
Benchmarks:
- CIFAR-10 (Image Classification)
- CheXpert (Medical Image Classification)
- Office-Home (Domain Adaptation / Object Recognition)
- CIFAR-10-C (Robustness / OOD Evaluation)
Metrics:
- Test Accuracy (Personalized)
- OOD Generalization Accuracy (CIFAR-10-C)
- Statistical methodology: results are reported as the mean and standard deviation over 3 runs.
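A minimal sketch of the mean and standard deviation aggregation over 3 runs follows. The accuracy values are placeholders chosen for easy arithmetic, not results from the paper, and `summarize_runs` is a hypothetical helper. The summary does not say whether the sample or population standard deviation was used; the sketch assumes the sample version (n - 1 in the denominator).

```python
from statistics import mean, stdev

def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    # Assumption: sample std (n - 1); use statistics.pstdev for the population version.
    return mean(accuracies), stdev(accuracies)

# Placeholder per-run accuracies for 3 seeds (illustrative only).
runs = [90.0, 91.0, 92.0]
run_mean, run_std = summarize_runs(runs)
print(f"{run_mean:.2f} ± {run_std:.2f}")  # prints "91.00 ± 1.00"
```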
Key Results
| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Main comparison on standard personalized performance across different datasets and non-IID settings. |
| CheXpert | Test Accuracy | 83.69 | 87.74 | +4.05 |
| CIFAR-10 (Covariate Shift) | Test Accuracy | 89.67 | 91.22 | +1.55 |
| Office-Home | Test Accuracy | 73.23 | 74.80 | +1.57 |
| Out-of-Distribution (OOD) generalization results demonstrate robustness to distribution shifts. |
| CIFAR-10-C | Test Accuracy | 81.65 | 86.88 | +5.23 |
| Privacy-Utility trade-off results. |
| CIFAR-10 (DP-FL, epsilon=1) | Test Accuracy | 83.21 | 89.26 | +6.05 |
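The Δ column can be re-derived from the Baseline and This Paper columns as a simple difference in accuracy points. The short check below uses only the values listed in the table; the dictionary layout and variable names are illustrative.

```python
# (baseline, this paper) test accuracy per benchmark, copied from the table.
results = {
    "CheXpert": (83.69, 87.74),
    "CIFAR-10 (Covariate Shift)": (89.67, 91.22),
    "Office-Home": (73.23, 74.80),
    "CIFAR-10-C": (81.65, 86.88),
    "CIFAR-10 (DP-FL, epsilon=1)": (83.21, 89.26),
}

# Delta is the absolute gain in accuracy points, rounded to two decimals.
deltas = {name: round(ours - base, 2) for name, (base, ours) in results.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

The recomputed gains match the reported column: +4.05, +1.55, +1.57, +5.23 and +6.05.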
Main Takeaways
- PerAda consistently outperforms both full-model (Ditto) and partial-model (FedPer, FedRep) personalization methods across natural and medical domains.
- The gain is larger in the out-of-distribution (OOD) setting (CIFAR-10-C, +5.23) than on any of the standard personalized benchmarks (+1.55 to +4.05), consistent with the hypothesis that knowledge distillation improves robustness.
- The method is parameter-efficient (updating ~12% of parameters), which also translates to better utility under Differential Privacy constraints: noise is added to a lower-dimensional update, so less total noise is injected (+6.05 on CIFAR-10 at epsilon=1).
- Ablation studies confirm that both the Adapter mechanism and the Knowledge Distillation component are necessary; removing KD drops performance significantly.
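The link between parameter efficiency and Differential Privacy utility noted in the takeaways can be illustrated numerically. The sketch assumes a Gaussian mechanism that adds independent N(0, sigma^2) noise to each coordinate of the shared update, with the same sigma in both cases. Under that assumption the noise norm grows roughly as sigma * sqrt(d), so updating ~12% of the parameters shrinks it by about sqrt(0.12) ≈ 0.35. The dimensions, `sigma`, and the helper `gaussian_noise_norm` are hypothetical and not taken from the paper.

```python
import math
import random

def gaussian_noise_norm(dim, sigma, rng):
    """L2 norm of a dim-dimensional vector of i.i.d. N(0, sigma^2) noise."""
    return math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(dim)))

rng = random.Random(0)
sigma = 1.0                            # hypothetical noise multiplier
full_dim = 20_000                      # hypothetical full-model size
adapter_dim = round(0.12 * full_dim)   # ~12% of parameters, as in the takeaway

full_norm = gaussian_noise_norm(full_dim, sigma, rng)
adapter_norm = gaussian_noise_norm(adapter_dim, sigma, rng)

# The ratio should sit close to sqrt(adapter_dim / full_dim) = sqrt(0.12).
ratio = adapter_norm / full_norm
print(f"noise norm ratio: {ratio:.3f} (theory: {math.sqrt(0.12):.3f})")
```

This only captures how noise scales with dimension. The actual privacy accounting and clipping in the paper may differ.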