PFL: Personalized Federated Learning—training client-specific models rather than a single global model to handle heterogeneous data
Partial model personalization: Updating only specific parameters (e.g., heads, normalization) locally while aggregating the rest on the server, which reduces communication cost and keeps the personalized parameters on the client
ViT: Vision Transformer—a neural network architecture that processes images as sequences of patches using self-attention mechanisms
Prefixes: Learnable vectors prepended to the Key and Value sequences in self-attention layers to steer the model's behavior without changing its pretrained weights
Label skew: A type of non-IID data where the distribution of labels varies across clients (e.g., Client A has only cats, Client B has only dogs)
Concept skew: A type of non-IID data where the same label looks different across clients (e.g., 'dog' in photos vs. 'dog' in sketches)
non-IID: Non-Independent and Identically Distributed—data distributions that differ between clients
Parallel adapter: A small neural network module (down-projection, activation, up-projection) used here to generate prefixes stably
FedAvg: Federated Averaging, the standard algorithm in which a server averages locally trained model weights from multiple clients, typically weighting each client by its local dataset size
APFL: Adaptive Personalized Federated Learning—a method that mixes a global model and a local model using an adaptive coefficient
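A few of the mechanisms defined above (FedAvg, the APFL mix, and prefixes in self-attention) can be illustrated with a minimal pure-Python sketch. The function names `fedavg`, `apfl_mix`, and `prefix_attention` are hypothetical and chosen for illustration only. The sketch assumes flat parameter lists, a fixed mixing coefficient `alpha` (the adaptive update of the coefficient is omitted), and a single attention query with one head.

```python
import math

def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def apfl_mix(local, global_, alpha):
    """Convex mix of local and global parameters: alpha*local + (1-alpha)*global.
    Assumption: alpha is fixed here; APFL adapts it during training."""
    return [alpha * l + (1 - alpha) * g for l, g in zip(local, global_)]

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def prefix_attention(q, keys, values, prefix_k, prefix_v):
    """Single-query scaled dot-product attention with learnable prefix
    vectors concatenated to the keys and values. The projection weights
    that produced q, keys, and values are left untouched."""
    ks = prefix_k + keys
    vs = prefix_v + values
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in ks]
    attn = softmax(scores)
    return [sum(p * v[j] for p, v in zip(attn, vs)) for j in range(len(vs[0]))]

# Two clients with 1 and 3 samples: the larger client dominates the average.
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
personal_w = apfl_mix([1.0, 0.0], [0.0, 1.0], alpha=0.25)
out = prefix_attention(
    q=[0.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0], [2.0]],
    prefix_k=[[1.0, 1.0]],
    prefix_v=[[6.0]],
)
```

In this sketch the prefix only adds extra key/value pairs for the query to attend to, so the personalized behavior comes from `prefix_k` and `prefix_v` alone. Without positional information, the order in which prefixes and tokens are concatenated does not change the attention output.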