The Expressive Power of Low-Rank Adaptation
Abstract
*Low-Rank Adaptation* (LoRA), a parameter-efficient fine-tuning method that leverages low-rank updates of weight matrices, has emerged as a prevalent technique for fine-tuning pre-trained models such as large language models and diffusion models. Despite its huge success in practice, the theoretical underpinnings of LoRA have largely remained unexplored. This paper takes a first step toward bridging this gap by theoretically analyzing the expressive power of LoRA. We prove that, for fully connected neural networks, LoRA can adapt any model $f$ to accurately represent any smaller target model $\overline{f}$ if LoRA-rank $\geq(\text{width of }f) \times \frac{\text{depth of }\overline{f}}{\text{depth of }f}$. We also quantify the approximation error when the LoRA-rank is lower than this threshold. For Transformer networks, we show that any model can be adapted to a target model of the same size with rank-$(\frac{\text{embedding size}}{2})$ LoRA adapters.
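To make the setting concrete, below is a minimal sketch of a rank-$r$ LoRA adapter on a single linear layer, assuming PyTorch; the class name `LoRALinear` and its initialization are illustrative choices, not the authors' implementation. The pre-trained weight $W_0$ stays frozen and only the low-rank factors $B$ and $A$ are trained, so the adapted layer computes $(W_0 + BA)x$.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable rank-r update: y = (W0 + B @ A) x."""

    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        # Pre-trained weight W0, kept frozen during fine-tuning.
        self.W0 = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Low-rank factors: B starts at zero so the adapter is a no-op before training.
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # rank(B @ A) <= r, so the weight update is confined to a low-rank subspace.
        return x @ (self.W0 + self.B @ self.A).T

# Example: adapt a width-64 layer with a rank-8 update.
layer = LoRALinear(in_features=64, out_features=64, rank=8)
y = layer(torch.randn(2, 64))  # shape (2, 64)
```

In the fully connected setting analyzed in the abstract, the result states that adapters of this form suffice for exact representation of the smaller target model once the rank reaches $(\text{width of }f) \times \frac{\text{depth of }\overline{f}}{\text{depth of }f}$.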
Cite
Text
Zeng and Lee. "The Expressive Power of Low-Rank Adaptation." NeurIPS 2023 Workshops: OPT, 2023.
Markdown
[Zeng and Lee. "The Expressive Power of Low-Rank Adaptation." NeurIPS 2023 Workshops: OPT, 2023.](https://mlanthology.org/neuripsw/2023/zeng2023neuripsw-expressive/)
BibTeX
@inproceedings{zeng2023neuripsw-expressive,
title = {{The Expressive Power of Low-Rank Adaptation}},
author = {Zeng, Yuchen and Lee, Kangwook},
booktitle = {NeurIPS 2023 Workshops: OPT},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/zeng2023neuripsw-expressive/}
}