
SMAT: Staged Multi-Agent Training for Co-Adaptive Exoskeleton Control

Yifei Yuan, Ghaith Androwis, Xianlian Zhou
Department of Biomedical Engineering
arXiv (2026)
RL Agent P13N

📝 Paper Summary

Human-Robot Interaction · Exoskeleton Control · Sim-to-Real · Reinforcement Learning
SMAT uses a four-stage curriculum to train human and exoskeleton agents in sequence, stabilizing the human's gait before introducing robotic assistance, to address the non-stationary co-adaptation problem.
Core Problem
Exoskeleton assistance is a non-stationary learning problem: as the device learns to assist, the human user simultaneously adapts their motor control, destabilizing the training environment for the robot.
Why it matters:
  • Simultaneous joint optimization without structure leads to instability, oscillatory torque outputs, and poorly timed assistance
  • Existing RL approaches do not explicitly model the sequential nature of human motor adaptation (learning to walk, then adapting to weight, then adapting to force)
  • Poor co-adaptation results in increased metabolic cost rather than the intended physical augmentation
Concrete Example: If a hip exoskeleton and a human model train simultaneously from scratch (Stage 4 only), the exoskeleton exploits the human's instability by learning to output near-zero torque to minimize energy penalties, resulting in an 83% reduction in assistive torque compared to the staged approach.
Key Novelty
Staged Multi-Agent Training (SMAT)
  • Decomposes the co-adaptation problem into four sequential stages: (1) Human learns to walk, (2) Human adapts to device mass, (3) Exoskeleton learns assistance on frozen human, (4) Joint co-adaptation.
  • Uses a 'frozen agent' strategy where one partner's policy is fixed while the other learns, preventing the 'moving target' problem inherent in simultaneous multi-agent learning.
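The four stages and the frozen-agent rule can be sketched as a simple schedule. This is only an illustrative outline, not the authors' implementation: the names `Stage`, `SMAT_SCHEDULE`, `trainable_agents`, and `run_schedule` are hypothetical, the `device_mass_attached` flag is an assumption about how Stage 2 differs from Stage 1, and the actual RL update is left as a caller-supplied callback.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    """One curriculum stage (hypothetical structure, for illustration only)."""
    name: str
    train_human: bool           # False means the human policy is frozen
    train_exo: bool             # False means the exoskeleton policy is frozen
    device_mass_attached: bool  # assumption: the device mass is present from Stage 2 on


# Stage order as described in the summary; the flags are an interpretation of it.
SMAT_SCHEDULE: Tuple[Stage, ...] = (
    Stage("1: human learns to walk", True, False, False),
    Stage("2: human adapts to device mass", True, False, True),
    Stage("3: exoskeleton learns assistance on frozen human", False, True, True),
    Stage("4: joint co-adaptation", True, True, True),
)


def trainable_agents(stage: Stage) -> List[str]:
    """Return the agents whose policies receive updates in this stage."""
    agents = []
    if stage.train_human:
        agents.append("human")
    if stage.train_exo:
        agents.append("exo")
    return agents


def run_schedule(
    schedule: Sequence[Stage],
    update_fn: Callable[[Stage, List[str]], None],
) -> List[Tuple[str, List[str]]]:
    """Run the stages in order, calling update_fn only for unfrozen agents.

    update_fn stands in for whatever RL update is used; the summary does not
    specify it, so it is left to the caller.
    """
    log = []
    for stage in schedule:
        agents = trainable_agents(stage)
        update_fn(stage, agents)
        log.append((stage.name, agents))
    return log
```

In this sketch only Stage 4 updates both policies at once, so each agent faces a stationary partner until both have a reasonable starting policy.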
Evaluation Highlights
  • 10.1% average reduction in hip muscle activation across 26 simulated muscles compared to the unassisted condition
  • 23.8 W mean positive power at 9.3 N·m RMS torque in real-world treadmill experiments with 5 subjects
  • Zero-shot generalization across walking speeds (0.6, 1.2, 1.8 m/s) using only hip kinematic inputs, maintaining consistent peak assistive torque
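For readers unfamiliar with the torque and power metrics above, the following sketch shows one common way to compute them from sampled joint torque and angular velocity. The definitions are assumptions (the summary does not state the paper's exact formulas): mechanical power is taken as torque times angular velocity, mean positive power averages only the positive part over all samples, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rms(values: Sequence[float]) -> float:
    """Root-mean-square of a sampled signal (e.g. assistive torque in N·m)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_positive_power(
    torque_nm: Sequence[float],
    velocity_rad_s: Sequence[float],
) -> float:
    """Average of the positive part of joint power, in watts.

    Assumed definition: P(t) = torque(t) * angular_velocity(t); negative
    samples are clipped to zero and the mean is taken over all samples.
    """
    if len(torque_nm) != len(velocity_rad_s):
        raise ValueError("torque and velocity must have the same length")
    powers = [max(t * w, 0.0) for t, w in zip(torque_nm, velocity_rad_s)]
    return sum(powers) / len(powers)


# Synthetic values for illustration only; these are not data from the paper.
torque = [2.0, -2.0, 2.0, -2.0]
velocity = [1.0, 1.0, -1.0, -1.0]
print(rms(torque))                            # RMS torque in N·m
print(mean_positive_power(torque, velocity))  # mean positive power in W
```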
Breakthrough Assessment
8/10
Strong methodological contribution applying curriculum-based multi-agent reinforcement learning (MARL) to biomechanics. The four-stage breakdown offers a logical solution to the co-adaptation instability problem, backed by both simulation results and real-world validation with human subjects.