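FedAvg: Federated Averaging, the standard federated learning algorithm, in which a server aggregates clients' locally trained models by averaging their parameters, typically weighted by each client's number of training samples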
Batch Normalization (BN): A technique that standardizes layer inputs over each mini-batch; it tracks a running mean and variance (statistics) and learns scale and shift parameters (weights)
Meta-net: A small neural network that takes metadata (data statistics) as input and outputs hyperparameters (learning rates, mixing weights)
Hypergradient: The gradient of the validation loss with respect to the hyperparameters, used to update the meta-nets
Implicit Function Theorem (IFT): A mathematical tool used to compute gradients of the optimal model parameters with respect to hyperparameters without unrolling the entire training loop
Sparsity: In this context, the percentage of model parameters assigned a learning rate of 0 by the meta-net (effectively freezing them)
Label Shift: Differences in the distribution of labels across clients (e.g., one client has only cats, another only dogs)
Feature Shift: Differences in the distribution of input features for the same labels (e.g., photos vs. sketches)
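ARI: Adjusted Rand Index, a chance-corrected measure of the similarity between two clusterings of the same data (1 means identical partitions; values near 0 mean agreement no better than random)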
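
To make the ARI entry concrete, here is a minimal pure-Python sketch of the standard pair-counting formula. The function name `adjusted_rand_index` and the handling of degenerate inputs (returning 1.0 when both clusterings are trivial) are illustrative assumptions, not something the source specifies.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two items counts as perfect agreement

    # Pairs of items placed together in both clusterings, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2           # upper bound on agreement
    if maximum == expected:
        return 1.0  # assumption: both clusterings trivial (one cluster or all singletons)
    return (together_both - expected) / (maximum - expected)
```

Cluster label names do not matter, only the grouping: `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0 because both labelings induce the same partition.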
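
The FedAvg entry can likewise be illustrated with a short sketch of the server-side aggregation step. This assumes each client's model is represented as a dict mapping a parameter name to a flat list of floats and that clients are weighted by sample count; the name `fedavg` and this data layout are hypothetical choices for illustration only.

```python
def fedavg(client_params, client_sizes):
    """Average client parameters, weighting each client by its sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")

    total = sum(client_sizes)
    averaged = {}
    for name, reference in client_params[0].items():
        averaged[name] = [
            # Weighted mean of the i-th entry of this parameter across clients.
            sum(size * params[name][i] for params, size in zip(client_params, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Usage: the client with three times as much data pulls the average toward its values.
global_model = fedavg(
    [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],
    client_sizes=[1, 3],
)
```

In a full training loop this step would run once per communication round, after each client has trained locally starting from the previous global model.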