Evaluation Setup
Federated active learning for image classification, evaluated under varying degrees of client heterogeneity (non-IID data) and global class imbalance.
Benchmarks:
- CIFAR-10 (Image Classification)
- Four other benchmarks (Image Classification)
Metrics:
- Mean Test Accuracy
- Area Under Learning Curve (AULC)
- Statistical methodology: paired analysis over 5 random seeds, reporting the Positive Ratio, a one-sided Wilcoxon p-value, and the Hodges-Lehmann estimator.
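The metrics above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the evaluation code behind these results: the function names, the normalization of AULC by the x-range, and the choice to pair results as one difference per seed are all assumptions, and the Wilcoxon routine is a simplified exact signed-rank test that assumes no zero differences and no tied magnitudes.

```python
from itertools import product
from statistics import median


def aulc(accuracies, xs=None):
    """Trapezoidal area under the accuracy curve, divided by the x-range.

    Normalizing by the x-range is an assumption; it keeps AULC on the
    same scale as accuracy.
    """
    if xs is None:
        xs = list(range(len(accuracies)))  # one point per active learning round
    area = sum(
        (xs[i + 1] - xs[i]) * (accuracies[i] + accuracies[i + 1]) / 2
        for i in range(len(accuracies) - 1)
    )
    return area / (xs[-1] - xs[0])


def positive_ratio(diffs):
    """Fraction of paired differences that are strictly positive."""
    return sum(d > 0 for d in diffs) / len(diffs)


def hodges_lehmann(diffs):
    """Median of the Walsh averages (d_i + d_j) / 2 over all i <= j."""
    n = len(diffs)
    walsh = [(diffs[i] + diffs[j]) / 2 for i in range(n) for j in range(i, n)]
    return median(walsh)


def wilcoxon_greater_exact(diffs):
    """Exact one-sided signed-rank p-value for H1: differences tend to be > 0.

    Assumes no zero differences and no ties among |d| (sketch only).
    """
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    rank = {idx: r for r, idx in enumerate(order, start=1)}
    w_plus = sum(rank[i] for i in range(n) if diffs[i] > 0)
    # Under H0 each rank carries a + or - sign with probability 1/2,
    # so enumerate all 2^n sign assignments.
    at_least = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(r * s for r, s in zip(range(1, n + 1), signs)) >= w_plus
    )
    return at_least / 2 ** n


# Hypothetical per-seed differences (method A minus method B), 5 seeds.
diffs = [0.5, 1.0, 1.5, 2.0, 2.5]
summary = (positive_ratio(diffs), wilcoxon_greater_exact(diffs), hodges_lehmann(diffs))
```

If only five paired differences enter the test, the exact one-sided p-value cannot fall below 1/32 (about 0.031), which is reached only when every difference is positive. That is one reason to read the p-value alongside the Positive Ratio and the Hodges-Lehmann effect size.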
Main Takeaways
- Whichever query model (global or local) selects more class-balanced samples, especially from minority classes, consistently yields better final performance.
- Global model querying is beneficial only when the global distribution is highly imbalanced and client data are relatively homogeneous.
- Local model querying is preferable when the global distribution is balanced or clients are highly heterogeneous (non-IID).
- The global model consistently outperforms the local model for diversity-based sampling (e.g., Coreset), owing to its better feature representations.
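One way to make "class-balanced sampling" in the first takeaway concrete is to score the labels of a queried batch. The sketch below is an assumption, not the measure used in the evaluation: it uses normalized Shannon entropy of the class histogram (1.0 for a perfectly uniform batch, 0.0 for a single-class batch) and the share of the batch drawn from a caller-supplied set of minority classes. All names and the example batches are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(labels, num_classes):
    """Shannon entropy of the batch's class histogram, scaled to [0, 1]."""
    n = len(labels)
    counts = Counter(labels)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(num_classes)


def minority_share(labels, minority_classes):
    """Fraction of the batch that belongs to the given minority classes."""
    return sum(label in minority_classes for label in labels) / len(labels)


# Hypothetical batches queried by two models on a 4-class problem,
# where classes 2 and 3 are assumed to be the global minority classes.
batch_a = [0, 0, 0, 1, 0, 0, 1, 0]
batch_b = [0, 1, 2, 3, 0, 1, 2, 3]

balance_a = normalized_entropy(batch_a, num_classes=4)
balance_b = normalized_entropy(batch_b, num_classes=4)
```

Here `batch_b` covers all four classes evenly and therefore scores higher on both measures than `batch_a`, which contains no minority-class samples at all.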