Do Pre-Trained Models Benefit Equally in Continual Learning?

Abstract

A large part of the continual learning (CL) literature focuses on developing algorithms for models trained from scratch. While these algorithms perform well with from-scratch trained models on widely used CL benchmarks, they show dramatic performance drops on more complex datasets (e.g., Split-CUB200). Pre-trained models, widely used to transfer knowledge to downstream tasks, could make these methods applicable in more realistic scenarios. Surprisingly, however, the improvements that CL algorithms gain from pre-training are inconsistent. For instance, while Incremental Classifier and Representation Learning (iCaRL) underperforms Supervised Contrastive Replay (SCR) when trained from scratch, it outperforms SCR when both are initialized with a pre-trained model. This indicates that the paradigm the current CL literature follows, where all methods are compared under from-scratch training, does not accurately reflect the true CL objective and the desired progress. Furthermore, we find that 1) CL algorithms that exert less regularization benefit more from a pre-trained model; 2) a model pre-trained on a larger dataset (WebImageText in Contrastive Language-Image Pre-training (CLIP) vs. ImageNet) does not guarantee a larger improvement. Based on these findings, we introduce a simple yet effective baseline that employs minimal regularization and leverages the more beneficial pre-trained model, and it outperforms state-of-the-art methods when pre-training is applied. Our code is available at https://github.com/eric11220/pretrained-models-in-CL.
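Below is a minimal sketch, in PyTorch, of the kind of baseline the abstract describes: an ImageNet-pre-trained backbone fine-tuned online with plain experience replay and no additional regularization. The class name, method names, and hyperparameters are illustrative assumptions for this sketch, not the authors' released implementation (see the linked repository for that).

import random
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class ReplayBaseline:
    """Illustrative baseline: pre-trained backbone + plain experience replay,
    with no extra regularization term (assumed setup, not the paper's exact code)."""

    def __init__(self, num_classes, buffer_size=1000, lr=0.01, device="cpu"):
        # ImageNet-pre-trained ResNet-18; string weight names require torchvision >= 0.13.
        self.model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
        self.model.fc = nn.Linear(self.model.fc.in_features, num_classes)
        self.model.to(device)
        self.opt = torch.optim.SGD(self.model.parameters(), lr=lr)
        self.buffer = []              # reservoir of (image, label) pairs kept on CPU
        self.buffer_size = buffer_size
        self.seen = 0
        self.device = device

    def _reservoir_add(self, x, y):
        # Reservoir sampling keeps an (approximately) uniform sample of the stream.
        for xi, yi in zip(x, y):
            self.seen += 1
            if len(self.buffer) < self.buffer_size:
                self.buffer.append((xi.cpu(), yi.cpu()))
            else:
                j = random.randrange(self.seen)
                if j < self.buffer_size:
                    self.buffer[j] = (xi.cpu(), yi.cpu())

    def observe(self, x, y, replay_batch=32):
        """One online update on the incoming batch plus a replayed batch."""
        self.model.train()
        x, y = x.to(self.device), y.to(self.device)
        loss = F.cross_entropy(self.model(x), y)
        if self.buffer:
            idx = random.sample(range(len(self.buffer)),
                                min(replay_batch, len(self.buffer)))
            bx = torch.stack([self.buffer[i][0] for i in idx]).to(self.device)
            by = torch.stack([self.buffer[i][1] for i in idx]).to(self.device)
            loss = loss + F.cross_entropy(self.model(bx), by)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        self._reservoir_add(x, y)
        return loss.item()

In use, one would call observe(x, y) on each incoming mini-batch of the task stream; the only memory of past tasks is the replay buffer, consistent with the "minimal regularization" idea the abstract argues benefits most from pre-training.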

Cite

Text

Lee et al. "Do Pre-Trained Models Benefit Equally in Continual Learning?." Winter Conference on Applications of Computer Vision, 2023.

Markdown

[Lee et al. "Do Pre-Trained Models Benefit Equally in Continual Learning?." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/lee2023wacv-pretrained/)

BibTeX

@inproceedings{lee2023wacv-pretrained,
  title     = {{Do Pre-Trained Models Benefit Equally in Continual Learning?}},
  author    = {Lee, Kuan-Ying and Zhong, Yuanyi and Wang, Yu-Xiong},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2023},
  pages     = {6485-6493},
  url       = {https://mlanthology.org/wacv/2023/lee2023wacv-pretrained/}
}