LoRA Without Forgetting: Freezing and Sparse Masking for Low-Rank Adaptation

Abstract

Existing parameter-efficient fine-tuning (PEFT) methods for large language models (LLMs), such as LoRA, alleviate the computational burden but still introduce redundant trainable parameters and remain susceptible to knowledge degradation when fine-tuned sequentially. In this work, we propose LoRA Without Forgetting (LoRAF), a novel PEFT method that reduces the number of trainable parameters while mitigating catastrophic forgetting. LoRAF achieves this by freezing the low-rank matrix $A$ and applying sparse, task-specific masks to the low-rank matrix $B$. To prevent interference between tasks, LoRAF enforces non-overlapping masks across different tasks. We evaluate LoRAF on natural language understanding and mathematical reasoning tasks using Mistral-7B. Our results demonstrate that LoRAF outperforms full fine-tuning (FFT) and LoRA while using 95\% fewer trainable parameters than LoRA. In a sequential learning setting, LoRAF significantly outperforms both LoRA and FFT in mitigating catastrophic forgetting.
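
The abstract describes the core mechanism: the low-rank matrix $A$ is frozen, only a sparse, task-specific subset of entries in $B$ is trained, and the subsets for different tasks do not overlap. The following is a minimal PyTorch-style sketch of that idea, not the authors' implementation; the class name `LoRAFLinear`, the `sparsity` parameter, and the random mask selection are illustrative assumptions.

```python
# Sketch (assumed, not the authors' code) of the mechanism in the abstract:
# freeze A, train only sparsely masked entries of B, keep per-task masks disjoint.
import torch
import torch.nn as nn


class LoRAFLinear(nn.Module):
    """Hypothetical LoRA-style adapter with a frozen A and disjoint per-task masks on B."""

    def __init__(self, base: nn.Linear, rank: int = 8, sparsity: float = 0.05):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                      # pretrained weight stays frozen

        out_f, in_f = base.out_features, base.in_features
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01, requires_grad=False)  # frozen A
        self.B = nn.Parameter(torch.zeros(out_f, rank))                              # trainable B
        self.sparsity = sparsity
        self.masks = {}                                   # task name -> binary mask over B
        self.used = torch.zeros(out_f, rank, dtype=torch.bool)  # entries owned by earlier tasks
        self.active_task = None

    def add_task(self, name: str):
        """Assign this task a random set of unused entries of B (non-overlapping with prior tasks)."""
        scores = torch.rand_like(self.B)
        scores[self.used] = -1.0                          # exclude entries already assigned
        k = int(self.sparsity * self.B.numel())
        idx = torch.topk(scores.flatten(), k).indices
        mask = torch.zeros_like(self.B, dtype=torch.bool)
        mask.view(-1)[idx] = True
        self.masks[name] = mask
        self.used |= mask
        self.active_task = name

    def forward(self, x):
        mask = self.masks[self.active_task].to(self.B.dtype)
        delta = (self.B * mask) @ self.A                  # only masked entries of B contribute
        return self.base(x) + x @ delta.t()
```

Because $B$ is multiplied by the active task's mask in the forward pass, gradients only reach the masked entries, so entries assigned to earlier tasks are not updated during later fine-tuning (assuming the optimizer applies no weight decay to $B$); the non-overlapping masks are what prevent interference between tasks in this sketch.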

Cite

Text

Zhang et al. "LoRA Without Forgetting: Freezing and Sparse Masking for Low-Rank Adaptation." ICLR 2025 Workshops: SLLM, 2025.

Markdown

[Zhang et al. "LoRA Without Forgetting: Freezing and Sparse Masking for Low-Rank Adaptation." ICLR 2025 Workshops: SLLM, 2025.](https://mlanthology.org/iclrw/2025/zhang2025iclrw-lora/)

BibTeX

@inproceedings{zhang2025iclrw-lora,
  title     = {{LoRA Without Forgetting: Freezing and Sparse Masking for Low-Rank Adaptation}},
  author    = {Zhang, Juzheng and You, Jiacheng and Panda, Ashwinee and Goldstein, Tom},
  booktitle = {ICLR 2025 Workshops: SLLM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/zhang2025iclrw-lora/}
}