Towards Federated Low-Rank Adaptation with Rank Heterogeneity
Abstract
Low-rank adaptation (LoRA) is an attractive alternative to full-weight adaptation for the federated fine-tuning of large pretrained models, as it can significantly reduce the computational burden. In principle, federated LoRA provides an effective means to allocate different resources to each client by tuning the rank assigned to each client. However, we find that the empirical performance of LoRA is highly unstable with respect to such rank heterogeneity. Our investigation reveals that the root cause of this instability is the zero-padding-based aggregation strategy adopted in conventional federated LoRA frameworks, which dilutes the information from high-rank clients during the aggregation process. To address this issue, we propose a new replication-based padding strategy, which better leverages the information from clients with high-quality datasets. This method ensures that valuable information from high-rank clients is retained during aggregation, accelerating convergence and enhancing the overall prediction quality of the global model.
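To make the contrast concrete, the following is a minimal NumPy sketch of the two padding schemes for aggregating LoRA factors of heterogeneous ranks. The exact replication rule used in the paper is not specified here; the `replicate_pad` variant below is a hypothetical illustration that cyclically repeats a client's rank-1 components (with rescaling so the padded product matches the original adapter), rather than filling the extra rank slots with zeros.

```python
import numpy as np

def zero_pad(B, A, r_max):
    """Conventional zero-padding: extend LoRA factors B (d x r) and
    A (r x k) to rank r_max by appending zero columns/rows."""
    d, r = B.shape
    k = A.shape[1]
    B_p = np.zeros((d, r_max))
    B_p[:, :r] = B
    A_p = np.zeros((r_max, k))
    A_p[:r, :] = A
    return B_p, A_p

def replicate_pad(B, A, r_max):
    """Hypothetical replication-based padding: fill the extra rank
    slots by cyclically repeating existing rank-1 components, then
    rescale each copy so the padded product B_p @ A_p still equals
    the original adapter B @ A."""
    d, r = B.shape
    idx = np.arange(r_max) % r               # cyclic replication of components
    counts = np.bincount(idx, minlength=r)   # number of copies per component
    B_p = B[:, idx]
    A_p = A[idx, :] / counts[idx][:, None]   # copies of a component sum to the original
    return B_p, A_p
```

After padding, all clients' factors share the common rank `r_max` and can be averaged as in standard federated LoRA aggregation; the point of the sketch is only that zero-padding leaves part of each low-rank client's factor empty, while replication keeps every rank slot informative.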
Cite
Byun and Lee. "Towards Federated Low-Rank Adaptation with Rank Heterogeneity." NeurIPS 2024 Workshops: AFM, 2024. https://mlanthology.org/neuripsw/2024/byun2024neuripsw-federated/
@inproceedings{byun2024neuripsw-federated,
title = {{Towards Federated Low-Rank Adaptation with Rank Heterogeneity}},
author = {Byun, Yuji and Lee, Jaeho},
booktitle = {NeurIPS 2024 Workshops: AFM},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/byun2024neuripsw-federated/}
}