
HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

Chunlin Tian, Zhanying Shi, Zhijiang Guo, Li Li, Chengzhong Xu
University of Macau, University of Texas at Austin, University of Cambridge
Neural Information Processing Systems (2024)
Reasoning QA Benchmark

📝 Paper Summary

Topics: Parameter-Efficient Fine-Tuning (PEFT) · Mixture-of-Experts (MoE)
HydraLoRA improves fine-tuning on diverse datasets by decoupling LoRA's symmetric matrices into a shared input projection and multiple distinct output experts routed dynamically.
Core Problem
Standard LoRA struggles with complex, heterogeneous datasets because a single set of low-rank matrices cannot simultaneously adapt to diverse tasks without interference.
Why it matters:
  • Parameter-efficient methods often underperform full fine-tuning in complex domains, creating a trade-off between cost and quality
  • Task interference in multi-task learning degrades performance when using a single monolithic adapter
  • Existing solutions either require full fine-tuning (expensive) or domain expertise to manually separate tasks
Concrete Example: When fine-tuning a model on a mix of medical, legal, and coding tasks, a standard LoRA module may learn conflicting parameter updates, causing the model to perform sub-optimally on all three domains compared to training a separate expert for each.
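To ground the interference problem, here is a minimal NumPy sketch of the standard LoRA update that HydraLoRA modifies: a frozen weight W0 adapted by a single low-rank product B·A, scaled by alpha/r. All dimensions and variable names here are illustrative, not from the paper.

```python
import numpy as np

# Illustrative dimensions: output dim, input dim, LoRA rank, scaling factor.
d_out, d_in, r, alpha = 8, 16, 4, 8
rng = np.random.default_rng(0)

W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-initialized

def lora_forward(x):
    # y = W0 x + (alpha / r) * B A x  -- one monolithic adapter for all tasks
    return W0 @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)  # with B = 0, this matches the frozen model's output
```

Because every task's gradients flow through the same single B·A pair, updates for medical, legal, and coding data all compete over the same low-rank subspace.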
Key Novelty
Asymmetric LoRA with MoE Routing
  • Splits the LoRA architecture asymmetrically: a single shared 'A' matrix captures common knowledge across all inputs, while multiple 'B' matrices act as specialized experts for specific sub-domains.
  • Uses an internal router (Mixture-of-Experts style) to dynamically weight the contributions of the 'B' matrices for each input, eliminating the need for manual domain labeling.
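The asymmetric design above can be sketched as follows: one shared A matrix, several expert B matrices, and a softmax router that mixes the experts per input. This is a hedged NumPy illustration of the forward pass under those assumptions; the router parameterization and all names are illustrative, not the paper's code.

```python
import numpy as np

# Illustrative dimensions: output dim, input dim, shared rank, number of B experts.
d_out, d_in, r, n_experts = 8, 16, 4, 3
rng = np.random.default_rng(1)

W0 = rng.standard_normal((d_out, d_in))                # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01              # single shared down-projection
Bs = [np.zeros((d_out, r)) for _ in range(n_experts)]  # per-expert up-projections
Wr = rng.standard_normal((n_experts, d_in)) * 0.01     # router weights (assumed linear)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def hydra_forward(x):
    gates = softmax(Wr @ x)   # per-input expert weights, no manual domain labels
    h = A @ x                 # common knowledge via the shared projection
    delta = sum(g * (B @ h) for g, B in zip(gates, Bs))
    return W0 @ x + delta

x = rng.standard_normal(d_in)
y = hydra_forward(x)  # with all Bs = 0, this matches the frozen model's output
```

Note the asymmetry: the A projection is computed once per input, while only the cheap per-expert B multiplications are duplicated, which is what keeps the added parameter and compute overhead small relative to running N full LoRA modules.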
Evaluation Highlights
  • Achieves 1.96x training speedup compared to standard LoRA (rank=32) on LLaMA2-7B
  • Reduces energy consumption by 49.6% compared to standard LoRA (rank=32) during fine-tuning
  • Consistently outperforms standard LoRA and LoRA-Split (manually separated heads) across single-domain and multi-task benchmarks (medical, law, math, code)
Breakthrough Assessment
7/10
Offers a clever architectural modification to LoRA that addresses the 'interference' problem in PEFT without adding significant overhead. The efficiency gains are substantial, though the core concept applies existing MoE ideas to LoRA matrices.