Learn More, but Bother Less: Parameter Efficient Continual Learning

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities owing to their extensive pre-training on diverse corpora. However, LLMs often suffer from catastrophic forgetting when learning tasks sequentially. In this paper, we propose a novel parameter-efficient approach to continual learning in LLMs that transfers knowledge from previously learned tasks to new tasks through low-rank matrix parameters, enhancing the learning of new tasks without significant interference. Our method applies a sensitivity-based analysis of the low-rank matrix parameters to identify knowledge-specific parameters across sequential tasks, which are then used to initialize the low-rank matrices for new tasks. To maintain orthogonality and minimize forgetting, we further incorporate a gradient projection technique that keeps each new task's low-rank subspace orthogonal to those of previous tasks. Experimental results on continual learning benchmarks validate the efficacy of the proposed method, which outperforms existing state-of-the-art methods in reducing forgetting, improving task performance, and preserving the model's ability to generalize to unseen tasks.
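
The abstract describes two mechanisms: a sensitivity-based selection of low-rank (LoRA-style) parameters from the previous task to initialize the new task's low-rank matrices, and a gradient projection that keeps the new task's updates orthogonal to the subspaces used by earlier tasks. Below is a minimal PyTorch sketch of these two ideas; the shapes, function names, the |parameter x gradient| sensitivity score, and the top-k selection rule are illustrative assumptions rather than the paper's exact implementation.

import torch

# Hypothetical shapes: one linear layer of size d_out x d_in with LoRA rank r.
d_in, d_out, r = 64, 64, 4

def sensitivity_scores(lora_A, lora_B, grad_A, grad_B):
    # First-order sensitivity |parameter * gradient| as an assumed proxy
    # for how knowledge-specific each low-rank entry is.
    return (lora_A * grad_A).abs(), (lora_B * grad_B).abs()

def init_from_previous_task(prev_A, prev_B, scores_A, scores_B, keep_ratio=0.5):
    # Initialize the new task's LoRA matrices by copying only the most
    # sensitive entries from the previous task; the rest start at zero.
    new_A, new_B = torch.zeros_like(prev_A), torch.zeros_like(prev_B)
    thresh_A = torch.quantile(scores_A.flatten(), 1 - keep_ratio)
    thresh_B = torch.quantile(scores_B.flatten(), 1 - keep_ratio)
    new_A[scores_A >= thresh_A] = prev_A[scores_A >= thresh_A]
    new_B[scores_B >= thresh_B] = prev_B[scores_B >= thresh_B]
    return new_A, new_B

def project_out_previous_subspaces(grad, prev_bases):
    # Remove the components of a new-task gradient that lie in the span of
    # previous tasks' low-rank directions, keeping updates orthogonal to them.
    for basis in prev_bases:  # each basis: (d, k) with orthonormal columns
        grad = grad - basis @ (basis.T @ grad)
    return grad

# Toy usage with random tensors standing in for a trained previous task.
prev_A, prev_B = torch.randn(r, d_in), torch.randn(d_out, r)
grad_A, grad_B = torch.randn_like(prev_A), torch.randn_like(prev_B)

s_A, s_B = sensitivity_scores(prev_A, prev_B, grad_A, grad_B)
new_A, new_B = init_from_previous_task(prev_A, prev_B, s_A, s_B)

# Orthonormal basis of the previous task's output subspace via QR, then
# project a new-task gradient out of that subspace before applying it.
Q, _ = torch.linalg.qr(prev_B)            # (d_out, r), orthonormal columns
projected_grad_B = project_out_previous_subspaces(torch.randn(d_out, r), [Q])

In this sketch, orthogonality is enforced only on the output-side subspace of a single layer; the paper's actual projection may operate on different subspaces or across multiple layers.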

Cite

Text

Qiao and Mahdavi. "Learn More, but Bother Less: Parameter Efficient Continual Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-3092

Markdown

[Qiao and Mahdavi. "Learn More, but Bother Less: Parameter Efficient Continual Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/qiao2024neurips-learn/) doi:10.52202/079017-3092

BibTeX

@inproceedings{qiao2024neurips-learn,
  title     = {{Learn More, but Bother Less: Parameter Efficient Continual Learning}},
  author    = {Qiao, Fuli and Mahdavi, Mehrdad},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3092},
  url       = {https://mlanthology.org/neurips/2024/qiao2024neurips-learn/}
}