Parameter Efficient Fine-Tuning of Self-Supervised ViTs Without Catastrophic Forgetting

Abstract

Artificial neural networks often suffer from catastrophic forgetting, where learning new concepts leads to a complete loss of previously acquired knowledge. We observe that this issue is particularly magnified in vision transformers (ViTs), where post-pre-training and fine-tuning on new tasks can significantly degrade the model’s original general abilities. For instance, a DINO ViT-Base/16 pre-trained on ImageNet-1k loses over 70% accuracy on ImageNet-1k after just 10 iterations of fine-tuning on CIFAR-100. Overcoming this stability-plasticity dilemma is crucial for enabling ViTs to continuously learn and adapt to new domains while preserving their initial knowledge. In this work, we study two new parameter-efficient fine-tuning strategies: (1) Block Expansion, and (2) Low-rank adaptation (LoRA). Our experiments reveal that using either Block Expansion or LoRA on self-supervised pre-trained ViTs surpasses fully fine-tuned ViTs in new domains while offering significantly greater parameter efficiency. Notably, we find that Block Expansion experiences only a minimal performance drop in the pre-training domain, thereby effectively mitigating catastrophic forgetting in pre-trained ViTs.
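The abstract names the two strategies without giving details, so the following is only a minimal NumPy sketch of the general ideas and not the authors' implementation. It assumes the standard LoRA formulation (a frozen weight plus a scaled low-rank update whose second factor starts at zero) and assumes that Block Expansion adds a residual block whose output projection starts at zero. All sizes and names (`d`, `r`, `alpha`, `lora_forward`, `residual_block`) are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4.0  # hypothetical width, LoRA rank, and LoRA scale

# --- LoRA (standard formulation, assumed here) ---
W = rng.normal(size=(d, d))              # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(r, d))  # trainable, small random init
B = np.zeros((d, r))                     # trainable, zero init


def lora_forward(x, W, A, B, alpha, r):
    """Frozen linear map plus a scaled low-rank update B @ A."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)


x = rng.normal(size=(3, d))
base_out = x @ W.T
lora_out = lora_forward(x, W, A, B, alpha, r)

# --- Added residual block (assumed zero-initialized output projection) ---
W_in = rng.normal(size=(4 * d, d))  # trainable input projection
W_out = np.zeros((d, 4 * d))        # trainable, zero init


def residual_block(x, W_in, W_out):
    """x + ReLU(x W_in^T) W_out^T; acts as the identity while W_out is zero."""
    hidden = np.maximum(x @ W_in.T, 0.0)
    return x + hidden @ W_out.T


block_out = residual_block(x, W_in, W_out)

lora_params = A.size + B.size  # 2 * r * d trainable values
full_params = W.size           # d * d values if W itself were fine-tuned
print(np.allclose(lora_out, base_out), np.allclose(block_out, x))
print(lora_params, full_params)
```

Under these assumptions both modules leave the pre-trained function unchanged at initialization, and the only trainable values are the newly added ones, which is what makes the approaches parameter-efficient.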

Cite

Text

Bafghi et al. "Parameter Efficient Fine-Tuning of Self-Supervised ViTs Without Catastrophic Forgetting." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00371

Markdown

[Bafghi et al. "Parameter Efficient Fine-Tuning of Self-Supervised ViTs Without Catastrophic Forgetting." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/bafghi2024cvprw-parameter/) doi:10.1109/CVPRW63382.2024.00371

BibTeX

@inproceedings{bafghi2024cvprw-parameter,
  title     = {{Parameter Efficient Fine-Tuning of Self-Supervised ViTs Without Catastrophic Forgetting}},
  author    = {Bafghi, Reza Akbarian and Harilal, Nidhin and Monteleoni, Claire and Raissi, Maziar},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {3679--3684},
  doi       = {10.1109/CVPRW63382.2024.00371},
  url       = {https://mlanthology.org/cvprw/2024/bafghi2024cvprw-parameter/}
}