Do Pre-Trained Models Benefit Equally in Continual Learning?
Abstract
A large part of the continual learning (CL) literature focuses on developing algorithms for models trained from scratch. While these algorithms work well with from-scratch trained models on widely used CL benchmarks, they show dramatic performance drops on more complex datasets (e.g., Split-CUB200). Pre-trained models, widely used to transfer knowledge to downstream tasks, could enhance these methods to be applicable in more realistic scenarios. However, surprisingly, the improvements that CL algorithms gain from pre-training are inconsistent. For instance, while Incremental Classifier and Representation Learning (iCaRL) underperforms Supervised Contrastive Replay (SCR) when trained from scratch, it outperforms SCR when both are initialized with a pre-trained model. This indicates that the current paradigm in the CL literature, where all methods are compared under from-scratch training, does not accurately reflect the true CL objective and desired progress. Furthermore, we find that 1) CL algorithms that exert less regularization benefit more from a pre-trained model; 2) a model pre-trained on a larger dataset (WebImageText in Contrastive Language-Image Pre-training (CLIP) vs. ImageNet) does not guarantee a larger improvement. Based on these findings, we introduce a simple yet effective baseline that employs minimal regularization and leverages the more beneficial pre-trained model, outperforming state-of-the-art methods when pre-training is applied. Our code is available at https://github.com/eric11220/pretrained-models-in-CL.
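The central comparison in the abstract is between the same CL algorithm run from a randomly initialized backbone versus one initialized with pre-trained weights. The following is a minimal sketch of those two initialization regimes, assuming a PyTorch/torchvision setup with an ImageNet pre-trained ResNet-18; it is not taken from the paper's code, and the class count and helper name are illustrative.

```python
# Minimal sketch (not the authors' implementation) of the two initialization
# regimes compared in the paper: from-scratch vs. pre-trained backbone.
# The chosen CL method (e.g., replay or regularization) would then be run
# on top of either model.
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 200  # e.g., Split-CUB200 covers 200 classes in total


def build_backbone(pretrained: bool) -> nn.Module:
    """Return a ResNet-18 that is either randomly initialized or ImageNet pre-trained."""
    weights = models.ResNet18_Weights.IMAGENET1K_V1 if pretrained else None
    model = models.resnet18(weights=weights)
    # Replace the ImageNet head with a fresh classifier for the CL label space.
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    return model


# Running the same CL algorithm with both initializations is what reveals
# the paper's observation: method rankings can flip once pre-training is used.
scratch_model = build_backbone(pretrained=False)
pretrained_model = build_backbone(pretrained=True)
```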
Cite
Text
Lee et al. "Do Pre-Trained Models Benefit Equally in Continual Learning?" Winter Conference on Applications of Computer Vision, 2023.
Markdown
[Lee et al. "Do Pre-Trained Models Benefit Equally in Continual Learning?" Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/lee2023wacv-pretrained/)
BibTeX
@inproceedings{lee2023wacv-pretrained,
  title     = {{Do Pre-Trained Models Benefit Equally in Continual Learning?}},
  author    = {Lee, Kuan-Ying and Zhong, Yuanyi and Wang, Yu-Xiong},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2023},
  pages     = {6485-6493},
  url       = {https://mlanthology.org/wacv/2023/lee2023wacv-pretrained/}
}