LoRA Learns Less and Forgets Less
Abstract
Low-Rank Adaptation (LoRA) is a widely used parameter-efficient finetuning method for large language models. LoRA saves memory by training only low-rank perturbations to selected weight matrices. In this work, we compare the performance of LoRA and full finetuning on two target domains, programming and mathematics. We consider both the instruction finetuning ($\approx$100K prompt-response pairs) and continued pretraining ($\approx$20B unstructured tokens) data regimes. Our results show that, in the standard low-rank settings, LoRA substantially underperforms full finetuning. Nevertheless, LoRA better maintains the base model's performance on tasks outside the target domain. We show that LoRA mitigates forgetting more than common regularization techniques such as weight decay and dropout; it also helps maintain more diverse generations. Finally, we show that full finetuning learns perturbations with a rank that is 10-100$\times$ greater than typical LoRA configurations, possibly explaining some of the reported gaps. We conclude by proposing best practices for finetuning with LoRA.
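Concretely, the "low-rank perturbations" the abstract refers to can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed defaults; the class name, initialization scale, and the `r`/`alpha` values are our own illustrative choices, not the paper's configuration:

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Freeze a linear layer's weight W and train a low-rank perturbation.

    The effective weight is W + (alpha / r) * (B @ A), where A is (r x k),
    B is (d x r), and r << min(d, k). Only A and B receive gradients.
    """

    def __init__(self, base: nn.Linear, r: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the full weight matrix stays frozen
        d, k = base.out_features, base.in_features
        # Common LoRA init: A random, B zero, so training starts at W exactly.
        self.lora_A = nn.Parameter(torch.randn(r, k) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base(x) = x @ W.T + bias; the LoRA path adds the rank-r correction.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


# Example with hypothetical sizes: a 4096-dim projection adapted at rank 16.
layer = LoRALinear(nn.Linear(4096, 4096), r=16)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")  # 2 * 16 * 4096 = 131072, vs 4096**2
```

The memory saving comes from the parameter count: the trainable adapter holds r(d + k) values versus dk for the full matrix, and the paper's finding is that full finetuning's learned perturbation has a rank 10-100$\times$ higher than the r typically used in such adapters.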
Cite

Text

Biderman et al. "LoRA Learns Less and Forgets Less." Transactions on Machine Learning Research, 2024.

Markdown

[Biderman et al. "LoRA Learns Less and Forgets Less." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/biderman2024tmlr-lora/)

BibTeX
@article{biderman2024tmlr-lora,
title = {{LoRA Learns Less and Forgets Less}},
author = {Biderman, Dan and Portes, Jacob and Ortiz, Jose Javier Gonzalez and Paul, Mansheej and Greengard, Philip and Jennings, Connor and King, Daniel and Havens, Sam and Chiley, Vitaliy and Frankle, Jonathan and Blakeney, Cody and Cunningham, John Patrick},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/biderman2024tmlr-lora/}
}