Chain of LoRA: Efficient Fine-Tuning of Language Models via Residual Learning
Abstract
Fine-tuning is the primary methodology for tailoring pre-trained large language models to specific tasks. As models scale and the diversity of tasks expands, parameter-efficient fine-tuning methods are of paramount importance. One of the most widely used families of methods is low-rank adaptation (LoRA) and its variants. LoRA encodes the weight update as the product of two low-rank matrices. Despite its advantages, LoRA falls short of full-parameter fine-tuning in terms of generalization error on certain tasks. We introduce Chain of LoRA (COLA), an iterative optimization framework inspired by the Frank-Wolfe algorithm, to bridge the gap between LoRA and full-parameter fine-tuning without incurring additional computational cost or memory overhead. COLA employs a residual learning procedure: it merges learned LoRA modules into the pre-trained language model's parameters and re-initializes optimization for newly spawned LoRA modules. We provide theoretical convergence guarantees as well as empirical results to validate the effectiveness of our algorithm. Across various models (OPT and Llama-2) and 11 benchmark tasks, we demonstrate that COLA consistently outperforms LoRA at no additional computational or memory cost.
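To make the residual learning loop concrete, below is a minimal PyTorch sketch of the chained procedure for a single linear layer. This is an illustrative reading of the abstract, not the authors' implementation: the names (LoRALinear, merge_and_reinit, train_cola) and hyperparameters (rank, chain length, learning rate) are all assumptions. Each chain link trains a fresh low-rank residual on top of the frozen weights, folds it into those weights, and restarts optimization from a re-initialized LoRA module.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank residual: W_eff = W + B @ A.
    Hypothetical sketch of one COLA-style adapted layer."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Standard LoRA init: A small random, B zero, so the residual starts at 0.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        # Equivalent to x @ (W + B @ A)^T without materializing the full update.
        return self.base(x) + (x @ self.A.t()) @ self.B.t()

    @torch.no_grad()
    def merge_and_reinit(self):
        # Fold the learned low-rank residual into the frozen weights, then
        # re-initialize A and B so the next chain link learns a new residual.
        self.base.weight += self.B @ self.A
        self.A.normal_(std=0.01)
        self.B.zero_()

def train_cola(layer, data_iter, chain_length=3, steps_per_link=100, lr=1e-3):
    """Train a chain of LoRA residuals; only A and B are ever optimized."""
    for _ in range(chain_length):
        # A fresh optimizer per link: optimizer state resets with the module.
        opt = torch.optim.AdamW([layer.A, layer.B], lr=lr)
        for _, (x, y) in zip(range(steps_per_link), data_iter):
            loss = nn.functional.mse_loss(layer(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        layer.merge_and_reinit()

Under these assumptions, the cost profile matches plain LoRA: the merge folds B @ A into the dense weight in place, so no extra parameters persist across links, and only the small A and B matrices (plus their optimizer state) are trained at any time.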
Cite
Text
Xia et al. "Chain of LoRA: Efficient Fine-Tuning of Language Models via Residual Learning." ICML 2024 Workshops: LLMs_and_Cognition, 2024.
Markdown
[Xia et al. "Chain of LoRA: Efficient Fine-Tuning of Language Models via Residual Learning." ICML 2024 Workshops: LLMs_and_Cognition, 2024.](https://mlanthology.org/icmlw/2024/xia2024icmlw-chain/)
BibTeX
@inproceedings{xia2024icmlw-chain,
title = {{Chain of LoRA: Efficient Fine-Tuning of Language Models via Residual Learning}},
author = {Xia, Wenhan and Qin, Chengwei and Hazan, Elad},
booktitle = {ICML 2024 Workshops: LLMs_and_Cognition},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/xia2024icmlw-chain/}
}