Supernet: A large, over-parameterized neural network containing all possible sub-architectures (paths) that can be selected during the search process
Non-i.i.d.: Not independent and identically distributed; refers to data whose distribution differs significantly between clients (e.g., one client has only 'cat' images, another only 'dog' images)
FedAvg: Federated Averaging; the standard federated learning (FL) algorithm, in which client model updates are averaged, typically weighted by each client's local dataset size, to produce a global model
NAS: Neural Architecture Search—automating the design of neural networks
ViT: Vision Transformer; an attention-based architecture that processes an image as a sequence of patches rather than with convolutions
FedTP: Federated Learning by Transformer Personalization; a baseline method that personalizes the Transformer's self-attention layers for each client
FedRep: A personalized FL method that learns a shared global representation and a personalized classifier head for each client
Policy Gradient: A reinforcement learning (RL) technique that optimizes a policy by adjusting its parameters along the gradient of the expected reward
Search Space: The set of all possible architectures that can be derived from the supernet
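
To make the FedAvg entry above concrete, the following is a minimal sketch of the averaging step only, assuming each client's model is flattened to a plain list of floats and that the average is weighted by local dataset size. The function name `fedavg` and its arguments are hypothetical and chosen for illustration; this is not the implementation used by any method listed here.

```python
def fedavg(client_params, client_sizes):
    """Size-weighted average of client parameter vectors (illustrative sketch).

    client_params: list of equal-length lists of floats, one per client
                   (assumption: models are flattened to 1-D parameter lists).
    client_sizes:  list of local dataset sizes, one per client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one dataset size per client")

    total = sum(client_sizes)
    dim = len(client_params[0])
    global_params = [0.0] * dim

    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share the same parameter shape")
        share = size / total  # this client's fraction of the total data
        for i, value in enumerate(params):
            global_params[i] += share * value

    return global_params


# Hypothetical example: client B holds three times as much data as client A,
# so the global parameters land closer to B's values.
client_a = [1.0, 2.0]
client_b = [4.0, 6.0]
global_model = fedavg([client_a, client_b], client_sizes=[1, 3])
```

Under non-i.i.d. data, clients with more samples pull the global model toward their own distribution, which is one motivation for the personalized methods defined above.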