PerFedRLNAS: One-for-All Personalized Federated Neural Architecture Search

📝 Paper Summary

Personalized Federated Learning
Federated Neural Architecture Search

PerFedRLNAS uses reinforcement learning to automatically search for and assign personalized neural architectures to different federated learning clients, optimizing for both local accuracy and hardware constraints without manual design.

Core Problem

Existing personalized federated learning methods rely on manual, fixed definitions of which model parts to share vs. personalize, often leading to suboptimal performance or failure to handle hardware heterogeneity.

Why it matters:

Manual design of shared/personalized layers is brittle and fails to adapt to diverse data distributions (non-i.i.d. data) across clients
System heterogeneity (different memory/compute budgets) requires different architectures per client, which standard fixed-model approaches cannot provide
Previous Federated NAS (Neural Architecture Search) methods are often inefficient or only search for a single global model, missing the benefits of personalization

Concrete Example: In a federated setting with heterogeneous devices, a small IoT device might crash trying to train a standard ResNet, while a powerful server is underutilized. Furthermore, if Client A has images of cars and Client B has images of animals, forcing them to share the exact same classifier head (or just fine-tuning the head manually) might yield worse accuracy than letting an algorithm automatically decide that Client A needs a deeper convolutional backbone than Client B.

Key Novelty

Personalized Federated Neural Architecture Search via Reinforcement Learning (PerFedRLNAS)

Maintains a 'virtual agent' for each client on the server that learns a policy to sample architectures from a shared supernet
Uses policy gradient updates driven by client-specific rewards (accuracy, latency, memory) to automatically tailor the architecture structure for each client
Eliminates the need for separate search and training phases by integrating architecture search directly into the federated communication rounds

Architecture

Overview of the PerFedRLNAS workflow including server-side supernet management and client-side training.

Evaluation Highlights

Achieves 85.02% accuracy on CIFAR-10 (ViT), outperforming the state-of-the-art FedTP by 4.75 percentage points and FedAvg by 12.8 percentage points
Improves accuracy on the harder CIFAR-100 task (ViT) to 65.08%, surpassing the best baseline (FedBABU) by 10.73 percentage points
Reaches convergence on CIFAR-10 in less total elapsed time than FedAvg (28.14 h vs. 34.95 h) while achieving higher accuracy

Breakthrough Assessment

8/10

Significantly outperforms manually designed personalization baselines and offers a unified framework for handling both data and system heterogeneity via automated NAS.

⚙️ Technical Details

Problem Definition

Setting: Personalized Federated Learning with K clients, each having a local objective function f_i(w_i)

Inputs: Client local datasets (non-i.i.d.), Server-side Supernet

Outputs: Personalized model architecture and weights w_i for each client i

Pipeline Flow

Server: Virtual Agent Selection -> Client: Local Training -> Server: Aggregation & Policy Update

System Modules

Virtual Agent (Server)

Maintains architecture parameters (alpha) for specific clients and samples a discrete architecture

Model or implementation: Softmax Policy over Search Space Dimensions
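
A minimal sketch of how a server-side virtual agent might sample a discrete architecture from a softmax policy over search-space dimensions. The dimension names (`depth`, `width`, `kernel`), their choices, and the zero initialization are hypothetical placeholders for illustration, not the paper's NASViT or MobileNetV3 search space.

```python
import math
import random

# Hypothetical search space: each dimension has a list of discrete choices.
SEARCH_SPACE = {
    "depth": [2, 3, 4],
    "width": [64, 96, 128],
    "kernel": [3, 5],
}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_alpha(space):
    """One logit per choice, initialised to zero (a uniform policy)."""
    return {dim: [0.0] * len(choices) for dim, choices in space.items()}

def sample_architecture(alpha, space, rng):
    """Sample one choice per dimension; return (chosen indices, config)."""
    indices, config = {}, {}
    for dim, choices in space.items():
        probs = softmax(alpha[dim])
        idx = rng.choices(range(len(choices)), weights=probs, k=1)[0]
        indices[dim] = idx
        config[dim] = choices[idx]
    return indices, config

rng = random.Random(0)
alpha_client = init_alpha(SEARCH_SPACE)  # one alpha per client on the server
indices, config = sample_architecture(alpha_client, SEARCH_SPACE, rng)
print(config)
```

Keeping one `alpha` dictionary per client on the server is what would let each client's policy drift toward its own architecture while the supernet weights stay shared.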

Client Trainer

Trains the received personalized model on local private data

Model or implementation: Sampled Sub-network (ViT or CNN)

Policy Updater (Server)

Updates the client's architecture preference policy based on reported performance

Model or implementation: Policy Gradient (REINFORCE)
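
The policy update can be sketched as a single REINFORCE step on one client's logits. This assumes an independent softmax per search-space dimension and plain gradient ascent with a hypothetical learning rate `lr`; the paper's optimizer and any baseline subtraction beyond what the reward already contains are not reproduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(alpha, chosen, reward, lr=0.1):
    """One gradient-ascent step on a client's architecture logits.

    alpha:  dict dim -> list of logits
    chosen: dict dim -> index sampled this round
    reward: scalar reward reported for the sampled architecture
    """
    new_alpha = {}
    for dim, logits in alpha.items():
        probs = softmax(logits)
        # d log p(chosen) / d logit_j = 1[j == chosen] - p_j
        new_alpha[dim] = [
            logit + lr * reward * ((1.0 if j == chosen[dim] else 0.0) - p)
            for j, (logit, p) in enumerate(zip(logits, probs))
        ]
    return new_alpha

alpha = {"depth": [0.0, 0.0, 0.0]}
updated = reinforce_step(alpha, chosen={"depth": 2}, reward=1.0)
print(updated)
```

A positive reward raises the logit of the sampled choice and lowers the others; a negative reward does the opposite, so architectures that underperform become less likely to be sampled for that client.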

Novel Architectural Elements

Deployment of 'Virtual Agents' on the server side to handle architecture search for clients, keeping client computation purely for standard training
Integration of hardware constraints (latency, memory) directly into the per-client reward function to shape architecture topology

Modeling

Base Model: Supernets based on NASViT (Vision Transformer) or MobileNetV3/DARTS (CNN)

Training Method: Federated Learning with Reinforcement Learning (Policy Gradient)

Objective Functions:

Purpose: Minimize local loss on client data.

Formally: min sum_{i=1..K}(f_i(w_i))

Purpose: Maximize expected reward for architecture selection policy.

Formally: grad(J) = r_i * grad(log(p_i(v_i)))

Purpose: Balance accuracy with efficiency/constraints in reward.

Formally: r_i(w_i) = Acc_i - Acc_avg - lambda * (RoundTime_i - min(RoundTime)) - MemoryPenalty_i
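
To make the reward formula concrete, here is a small worked sketch with made-up numbers for three clients. The `memory_penalty` form (zero within budget, linear in the relative overshoot) and the value `lam=0.01` are assumptions for illustration; the summary gives neither the exact penalty nor the lambda value.

```python
def memory_penalty(used_mb, budget_mb, weight=1.0):
    """Hypothetical penalty: zero within budget, linear in relative overshoot."""
    overshoot = max(0.0, used_mb - budget_mb)
    return weight * overshoot / budget_mb

def client_reward(acc, acc_avg, round_time, min_round_time, lam, mem_pen):
    """r_i = Acc_i - Acc_avg - lambda * (RoundTime_i - min(RoundTime)) - MemoryPenalty_i"""
    return acc - acc_avg - lam * (round_time - min_round_time) - mem_pen

# Made-up values for three clients in one round.
accs = [0.80, 0.70, 0.75]
times = [12.0, 10.0, 15.0]           # seconds per round
mem_used = [900.0, 400.0, 1200.0]    # MB
mem_budget = [1000.0, 500.0, 1000.0]  # MB

acc_avg = sum(accs) / len(accs)
t_min = min(times)
rewards = [
    client_reward(a, acc_avg, t, t_min, lam=0.01,
                  mem_pen=memory_penalty(u, b))
    for a, t, u, b in zip(accs, times, mem_used, mem_budget)
]
print(rewards)
```

In this toy round only the first client gets a positive reward: it beats the average accuracy, is only slightly slower than the fastest client, and stays within its memory budget. The third client matches the average accuracy but is penalized for both its round time and its memory overshoot.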

Adaptation: Personalized architecture sampling via alpha parameters

Trainable Parameters: Supernet weights (via aggregation) and Architecture policy parameters (alpha)
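
Because each client trains only a sampled sub-network, supernet aggregation has to cope with parameters that some clients never touched. One plausible scheme, shown here purely as an assumption (the summary says only "via aggregation"), averages each parameter over the clients whose sub-network contained it and leaves untouched parameters unchanged. Parameter names and values are made up.

```python
def aggregate_supernet(supernet, client_updates):
    """Average each supernet parameter over the clients that trained it.

    supernet:       dict name -> list of floats (current supernet weights)
    client_updates: list of dicts name -> list of floats; each dict holds
                    only the parameters in that client's sampled sub-network
    Parameters that no client trained this round keep their old value.
    """
    new_supernet = {}
    for name, old in supernet.items():
        contributions = [u[name] for u in client_updates if name in u]
        if not contributions:
            new_supernet[name] = list(old)
            continue
        n = len(contributions)
        new_supernet[name] = [sum(vals) / n for vals in zip(*contributions)]
    return new_supernet

supernet = {"block1": [0.0, 0.0], "block2": [1.0, 1.0], "block3": [5.0, 5.0]}
updates = [
    {"block1": [2.0, 4.0]},                         # shallow sub-network
    {"block1": [4.0, 8.0], "block2": [3.0, 3.0]},   # deeper sub-network
]
merged = aggregate_supernet(supernet, updates)
print(merged)
```

A weighted variant (for example by local dataset size, as in FedAvg) would replace the plain mean with a weighted one; which variant the authors use would need to be checked against their code.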

Training Data:

CIFAR-10 and CIFAR-100
Non-i.i.d. partition using a Dirichlet distribution with concentration parameter alpha = 0.1 or 0.3 (distinct from the architecture parameters alpha)
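
The sketch below shows one common way to build a Dirichlet label-skew partition, using only the standard library. It is an illustration of the general technique, not necessarily the exact procedure used in the paper; the toy labels stand in for CIFAR-10, and the function names are hypothetical.

```python
import random

def dirichlet(concentration, n, rng):
    """Draw one symmetric Dirichlet sample via normalised Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(n)]
    total = sum(draws)
    if total == 0.0:  # guard against an extremely unlikely underflow
        return [1.0 / n] * n
    return [d / total for d in draws]

def partition_by_label(labels, num_clients, concentration, seed=0):
    """Split sample indices across clients, class by class.

    For each class, a Dirichlet draw decides what share of that class
    each client receives. Smaller concentration gives more skewed clients.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        shares = dirichlet(concentration, num_clients, rng)
        # Turn cumulative shares into cut points over this class's samples.
        start, cum = 0, 0.0
        for c, share in enumerate(shares):
            cum += share
            end = len(idxs) if c == num_clients - 1 else round(cum * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients

# Toy data: 10 classes, 100 samples each (a stand-in for CIFAR-10 labels).
labels = [i % 10 for i in range(1000)]
parts = partition_by_label(labels, num_clients=10, concentration=0.1)
print([len(p) for p in parts])
```

With a concentration of 0.1, most of each class typically lands on one or two clients, which is the kind of label skew that makes a single shared model a poor fit.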

Key Hyperparameters:

participation_rate: 5% (5 out of 100 clients per round)
local_epochs: 5
upload_download_rate: 100 Mbps
lambda_time: Weights the round-time penalty against accuracy in the reward (value not explicitly listed in the main text, likely tuned)
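
As a rough illustration of what these settings imply in a simulation, the sketch below picks 5 of 100 clients per round and estimates a one-way transfer time over the 100 Mbps link. The 25 MB model size and the function names are made up; actual sub-network sizes depend on the sampled architecture.

```python
import random

NUM_CLIENTS = 100
CLIENTS_PER_ROUND = 5   # 5% participation
LINK_MBPS = 100.0       # simulated upload/download rate

def select_clients(round_seed):
    """Uniformly sample the participating client ids for one round."""
    rng = random.Random(round_seed)
    return sorted(rng.sample(range(NUM_CLIENTS), CLIENTS_PER_ROUND))

def transfer_seconds(model_megabytes, link_mbps=LINK_MBPS):
    """One-way transfer time: megabytes -> megabits, divided by link rate."""
    return model_megabytes * 8.0 / link_mbps

selected = select_clients(round_seed=1)
print(selected, transfer_seconds(25.0))  # 25 MB is a hypothetical model size
```

Under these assumptions a 25 MB sub-network takes 2 seconds per direction, so communication is a small share of a round that runs 5 local epochs.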

Compute: Single GPU for server simulation; Clients simulated as constrained devices. CIFAR-10 training takes ~28 hours.

Comparison to Prior Work

vs. FedRep/FedBABU: PerFedRLNAS personalizes the entire architecture topology, not just specific layers/heads
vs. FedTP: PerFedRLNAS is search-space agnostic (works for CNNs and ViTs) and handles system heterogeneity (latency/memory)
vs. FedNAS: Searches for N distinct personalized architectures instead of 1 global architecture

Limitations

Relies on a supernet approach, which requires the supernet to be designed beforehand
Training multiple personalized architectures might still incur higher aggregate computational cost on the server than simple aggregation
Validation of memory constraints is simulated; real-world deployment on diverse hardware might face unforeseen compilation issues
No statistical significance tests reported for the accuracy improvements

Reproducibility

Code: https://github.com/TL-System/plato/tree/main/examples/model_search/pfedrlnas

The implementation uses the Plato federated learning framework, and the CIFAR datasets are public. The reward weight lambda_time is discussed conceptually in the paper, but its exact numerical value for each experiment may have to be read from the code.

📊 Experiments & Results

Evaluation Setup

Federated Learning simulation with 100 clients, non-i.i.d. data partitions

Benchmarks:

CIFAR-10 (Image Classification)
CIFAR-100 (Image Classification)

Metrics:

Top-1 Test Accuracy (Average over clients)
Latency / Elapsed Wall-clock Time
Memory Consumption

Statistical methodology: Mean and standard deviation over clients are reported. No p-values are reported.
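
For reference, a per-client mean and standard deviation can be computed as below. The accuracy values are made up, and whether the paper uses the population or the sample standard deviation is not stated, so both are shown.

```python
import statistics

# Made-up per-client top-1 accuracies (%), one value per client.
client_acc = [84.0, 86.0, 85.0, 83.0, 87.0]

mean_acc = statistics.mean(client_acc)
std_population = statistics.pstdev(client_acc)  # divides by N
std_sample = statistics.stdev(client_acc)       # divides by N - 1
print(f"{mean_acc:.2f} +/- {std_population:.2f} (population), "
      f"{std_sample:.2f} (sample)")
```

Note that a standard deviation over clients describes how unevenly accuracy is spread across clients, not run-to-run variance, so it does not substitute for a significance test across random seeds.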

Key Results

Performance comparison on the NASViT search space shows significant accuracy gains over state-of-the-art PFL methods.

Benchmark	Metric	Baseline	This Paper	Δ
CIFAR-10 (ViT)	Accuracy (%)	80.27 (FedTP)	85.02	+4.75
CIFAR-100 (ViT)	Accuracy (%)	54.35 (FedBABU)	65.08	+10.73
CIFAR-10 (ViT)	Elapsed time to convergence (hours)	34.95 (FedAvg)	28.14	-6.81

Performance on the MobileNetV3 (CNN) search space confirms generalization across model types.

Benchmark	Metric	Baseline	This Paper	Δ
CIFAR-10 (CNN)	Accuracy (%)	75.00	82.02	+7.02
CIFAR-100 (CNN)	Accuracy (%)	48.75	63.85	+15.10

Experiment Figures

Convergence curves (Test Accuracy vs Time) for CIFAR-10 and CIFAR-100.

Memory budget vs. Memory consumption for individual devices.

Main Takeaways

Personalizing the full architecture (One-for-All) yields significantly higher accuracy than personalizing just heads (FedRep) or attention (FedTP), especially on harder tasks like CIFAR-100.
The method demonstrates strong generalization to new clients who did not participate in training, adapting faster than baselines.
Architecture search can be effectively aware of device constraints; models respect memory budgets defined in the reward function.
Implicit clustering occurs: clients with similar data distributions tend to converge to similar architectures without manual clustering.

📚 Prerequisite Knowledge

Prerequisites

Federated Learning (FedAvg algorithm)
Neural Architecture Search (NAS) concepts
Reinforcement Learning (Policy Gradient/REINFORCE)

Key Terms

Supernet: A large, over-parameterized neural network containing all possible sub-architectures (paths) that can be selected during the search process

Non-i.i.d.: Not independent and identically distributed; refers to data whose distribution varies significantly between clients (e.g., one client has only cat images, another only dog images)

FedAvg: Federated Averaging—the standard algorithm for FL where client updates are averaged to create a global model

NAS: Neural Architecture Search—automating the design of neural networks

ViT: Vision Transformer, a model architecture that splits an image into patches and processes them with attention mechanisms rather than convolutions

FedTP: Federated Learning by Transformer Personalization—a baseline method that customizes attention mechanisms

FedRep: A personalized FL method that learns a global representation but personalized classifier heads

Policy Gradient: An RL technique that optimizes a policy by adjusting parameters in the direction that increases expected reward

Search Space: The set of all possible architectures that can be derived from the supernet
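
As a worked illustration of the Policy Gradient entry above, the gradient of the log-probability has a simple closed form when the policy is a softmax. The block assumes, for illustration only, that client i's policy factorizes into an independent softmax per search-space dimension d with logits alpha_{i,d,k}; the symbols d, k, and v_d are notation introduced here.

```latex
\[
p_i(v \mid \alpha_i) \;=\; \prod_{d} \frac{\exp(\alpha_{i,d,v_d})}{\sum_{k} \exp(\alpha_{i,d,k})},
\qquad
\frac{\partial \log p_i(v \mid \alpha_i)}{\partial \alpha_{i,d,k}}
\;=\; \mathbb{1}[k = v_d] \;-\; \frac{\exp(\alpha_{i,d,k})}{\sum_{k'} \exp(\alpha_{i,d,k'})}
\]
```

Multiplying this term by the reward r_i gives the per-logit update direction: the logit of the chosen option moves with the sign of the reward, and the logits of the other options move against it in proportion to their current probability.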