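FedAvg: Federated Averaging, the standard federated learning (FL) algorithm, in which a server aggregates client model updates by averaging them (typically weighted by each client's local dataset size)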
FedAvg+: Federated Averaging followed by local fine-tuning (similar to Reptile)
non-i.i.d.: Not independent and identically distributed; here, data whose class distribution varies significantly across clients
logits: Raw, unnormalized predictions generated by the last layer of a neural network before the softmax activation
collaboration graph: A weighted graph where edge weights represent the relevance or similarity between the learning tasks of connected nodes
Wasserstein distance: A distance metric between probability distributions, measuring the minimum 'cost' to transform one distribution into another
star node: A temporarily selected node that coordinates aggregation for a subset of neighbors in a decentralized round (dynamic role)
co-distillation: A collaborative learning method where peer models exchange knowledge by mimicking each other's outputs (predictions) rather than just parameters
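
As a rough illustration of several terms above, the following minimal Python sketch shows FedAvg-style aggregation, the logits-to-softmax step, a 1-D Wasserstein distance, and a co-distillation loss. All function names are hypothetical and chosen for this example only. The sketch assumes the common dataset-size weighting for FedAvg, restricts the Wasserstein distance to two equal-size one-dimensional samples (where it reduces to the mean gap between sorted values), and uses cross-entropy against a peer's soft predictions as one possible co-distillation objective. It is not taken from any particular paper or library.

```python
import math


def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def softmax(logits):
    """Turn raw logits into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def codistill_loss(own_logits, peer_logits):
    """Cross-entropy of a model's predictions against a peer's soft predictions."""
    p = softmax(peer_logits)  # peer output, treated as the soft target
    q = softmax(own_logits)   # this model's own output
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Two clients with 1 and 3 local examples: the larger client dominates.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    # Shifting every sample by 1 moves the distribution a distance of 1.
    print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
    # The loss is smaller when the model agrees with its peer.
    print(codistill_loss([2.0, 0.0], [2.0, 0.0]))
    print(codistill_loss([0.0, 2.0], [2.0, 0.0]))
```

In this sketch the co-distillation loss depends only on exchanged outputs, which is why peers with different architectures could in principle take part; FedAvg, by contrast, needs parameter vectors of identical shape.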