Predicting Task Forgetting in Large Language Models
Abstract
In this paper, we offer a comprehensive evaluation of forgetting in large language models (LLMs) as a pretrained model is sequentially fine-tuned on a series of tasks. We empirically track the degradation of performance across diverse tasks and find that the validation perplexity can be predicted using a linear function, regardless of the specific task, model architecture, or task order. These findings shed light on the dynamics of knowledge acquisition and retention, offering practical implications for managing and mitigating task forgetting in LLM-based systems.
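As a rough illustration of the abstract's central claim (not the paper's actual procedure or data), the sketch below fits a linear function to an earlier task's validation perplexity as later tasks are trained, then extrapolates it; the task counts and perplexity values are made-up placeholders.

import numpy as np

# Hypothetical forgetting curve: validation perplexity of task 1 measured
# after fine-tuning on each subsequent task (placeholder values, not paper data).
tasks_seen = np.array([1, 2, 3, 4, 5])               # number of later tasks trained
val_ppl    = np.array([8.1, 9.0, 9.8, 10.7, 11.5])   # task-1 validation perplexity

# Least-squares fit of a linear function: ppl ~= a * tasks_seen + b.
a, b = np.polyfit(tasks_seen, val_ppl, deg=1)

def predict_ppl(n_later_tasks: int) -> float:
    """Extrapolate task-1 validation perplexity after n_later_tasks later tasks."""
    return a * n_later_tasks + b

print(f"slope={a:.3f}, intercept={b:.3f}")
print("predicted perplexity after 8 later tasks:", round(predict_ppl(8), 2))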
Cite
Text
Kleiman et al. "Predicting Task Forgetting in Large Language Models." ICML 2023 Workshops: DeployableGenerativeAI, 2023.
Markdown
[Kleiman et al. "Predicting Task Forgetting in Large Language Models." ICML 2023 Workshops: DeployableGenerativeAI, 2023.](https://mlanthology.org/icmlw/2023/kleiman2023icmlw-predicting/)
BibTeX
@inproceedings{kleiman2023icmlw-predicting,
title = {{Predicting Task Forgetting in Large Language Models}},
author = {Kleiman, Anat and Frankle, Jonathan and Kakade, Sham M. and Paul, Mansheej},
booktitle = {ICML 2023 Workshops: DeployableGenerativeAI},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/kleiman2023icmlw-predicting/}
}