Sparsity for Communication-Efficient LoRA

Abstract

Recently, several works have used unstructured pruning to augment adapter methods. However, these "sparse adapter" methods offer limited communication benefits in federated learning (FL). In this work, we propose a simple baseline that combines LoRA with a constant level of sparsity applied during communication only. On three FL image and text tasks, our method reduces communication costs by up to $10\times$ over vanilla (dense) LoRA and up to $5\times$ over more complex sparse LoRA baselines. Our work highlights the importance of considering system-specific constraints when developing efficient fine-tuning approaches and serves as a competitive baseline for future work in federated fine-tuning.
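
A minimal sketch of the idea described in the abstract: each client trains its LoRA factors densely, and a fixed (constant) sparsity level is applied only to the update that is communicated to the server. The function name, the top-k magnitude criterion, and the use of plain PyTorch tensors are illustrative assumptions, not the authors' exact implementation.

import torch

def sparsify_for_communication(tensor: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out all but the largest-magnitude entries, keeping a (1 - sparsity) fraction."""
    k = max(1, int(tensor.numel() * (1.0 - sparsity)))
    flat = tensor.flatten()
    threshold = flat.abs().topk(k).values.min()   # k-th largest magnitude
    mask = flat.abs() >= threshold
    return (flat * mask).view_as(tensor)

# Example: a rank-8 LoRA update for a 512x512 weight, sparsified at 90% before upload.
rank, d = 8, 512
lora_A, lora_B = torch.randn(rank, d), torch.randn(d, rank)
payload = {
    "A": sparsify_for_communication(lora_A, sparsity=0.9).to_sparse(),
    "B": sparsify_for_communication(lora_B, sparsity=0.9).to_sparse(),
}
# The server would densify, aggregate across clients, and broadcast back;
# local training continues on the dense LoRA factors.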

Cite

Text

Kuo et al. "Sparsity for Communication-Efficient LoRA." ICLR 2024 Workshops: PML4LRS, 2024.

Markdown

[Kuo et al. "Sparsity for Communication-Efficient LoRA." ICLR 2024 Workshops: PML4LRS, 2024.](https://mlanthology.org/iclrw/2024/kuo2024iclrw-sparsity/)

BibTeX

@inproceedings{kuo2024iclrw-sparsity,
  title     = {{Sparsity for Communication-Efficient LoRA}},
  author    = {Kuo, Kevin and Raje, Arian and Rajesh, Kousik and Smith, Virginia},
  booktitle = {ICLR 2024 Workshops: PML4LRS},
  year      = {2024},
  url       = {https://mlanthology.org/iclrw/2024/kuo2024iclrw-sparsity/}
}