Learn More, but Bother Less: Parameter Efficient Continual Learning

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities owing to their extensive pre-training on diverse corpora. However, LLMs often suffer from catastrophic forgetting when learning tasks sequentially. In this paper, we propose a novel parameter-efficient approach to continual learning in LLMs that transfers knowledge from previously learned tasks to new tasks through low-rank matrix parameters, enhancing the learning of new tasks without significant interference. Our method applies a sensitivity-based analysis of the low-rank matrix parameters to identify knowledge-specific parameters across sequential tasks, which are then used to initialize the low-rank matrices for new tasks. To maintain orthogonality and minimize forgetting, we further incorporate a gradient projection technique that keeps each new task's low-rank subspace orthogonal to those of previous tasks. Experimental results on continual learning benchmarks validate the efficacy of the proposed method, which outperforms existing state-of-the-art methods in reducing forgetting, improving task performance, and preserving the model's ability to generalize to unseen tasks.
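
The abstract describes two mechanisms: a sensitivity-based selection of low-rank (LoRA-style) parameters from the previous task to initialize the new task's low-rank matrices, and a gradient projection that keeps the new task's updates orthogonal to the subspaces used by earlier tasks. Below is a minimal PyTorch sketch of these two ideas; the shapes, function names, the |parameter x gradient| sensitivity score, and the top-k selection rule are illustrative assumptions rather than the paper's exact implementation.

import torch

# Hypothetical shapes: one linear layer of size d_out x d_in with LoRA rank r.
d_in, d_out, r = 64, 64, 4

def sensitivity_scores(lora_A, lora_B, grad_A, grad_B):
    # First-order sensitivity |parameter * gradient| as an assumed proxy
    # for how knowledge-specific each low-rank entry is.
    return (lora_A * grad_A).abs(), (lora_B * grad_B).abs()

def init_from_previous_task(prev_A, prev_B, scores_A, scores_B, keep_ratio=0.5):
    # Initialize the new task's LoRA matrices by copying only the most
    # sensitive entries from the previous task; the rest start at zero.
    new_A, new_B = torch.zeros_like(prev_A), torch.zeros_like(prev_B)
    thresh_A = torch.quantile(scores_A.flatten(), 1 - keep_ratio)
    thresh_B = torch.quantile(scores_B.flatten(), 1 - keep_ratio)
    new_A[scores_A >= thresh_A] = prev_A[scores_A >= thresh_A]
    new_B[scores_B >= thresh_B] = prev_B[scores_B >= thresh_B]
    return new_A, new_B

def project_out_previous_subspaces(grad, prev_bases):
    # Remove the components of a new-task gradient that lie in the span of
    # previous tasks' low-rank directions, keeping updates orthogonal to them.
    for basis in prev_bases:  # each basis: (d, k) with orthonormal columns
        grad = grad - basis @ (basis.T @ grad)
    return grad

# Toy usage with random tensors standing in for a trained previous task.
prev_A, prev_B = torch.randn(r, d_in), torch.randn(d_out, r)
grad_A, grad_B = torch.randn_like(prev_A), torch.randn_like(prev_B)

s_A, s_B = sensitivity_scores(prev_A, prev_B, grad_A, grad_B)
new_A, new_B = init_from_previous_task(prev_A, prev_B, s_A, s_B)

# Orthonormal basis of the previous task's output subspace via QR, then
# project a new-task gradient out of that subspace before applying it.
Q, _ = torch.linalg.qr(prev_B)            # (d_out, r), orthonormal columns
projected_grad_B = project_out_previous_subspaces(torch.randn(d_out, r), [Q])

In this sketch, orthogonality is enforced only on the output-side subspace of a single layer; the paper's actual projection may operate on different subspaces or across multiple layers.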

Cite

Text

Qiao and Mahdavi. "Learn More, but Bother Less: Parameter Efficient Continual Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-3092

Markdown

[Qiao and Mahdavi. "Learn More, but Bother Less: Parameter Efficient Continual Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/qiao2024neurips-learn/) doi:10.52202/079017-3092

BibTeX

@inproceedings{qiao2024neurips-learn,
  title     = {{Learn More, but Bother Less: Parameter Efficient Continual Learning}},
  author    = {Qiao, Fuli and Mahdavi, Mehrdad},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3092},
  url       = {https://mlanthology.org/neurips/2024/qiao2024neurips-learn/}
}