Memory Efficient Continual Learning with CLIP Models

Abstract

Contrastive Language-Image Pretraining (CLIP) models excel at understanding image-text relationships but struggle to adapt to new data without forgetting prior knowledge. To address this, models are typically fine-tuned using both new task data and a memory buffer of past tasks. However, CLIP's contrastive loss degrades when the memory buffer is small, hurting performance on previous tasks. We propose a memory-efficient, distributionally robust method that dynamically reweights losses per class during training. Our approach, tested in class-incremental settings (CIFAR-100, ImageNet1K) and a domain-incremental setting (DomainNet), adapts CLIP models quickly while minimizing catastrophic forgetting, even with minimal memory usage.
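To make the idea of per-class distributionally robust reweighting concrete, the sketch below shows one plausible way such a weighting could be computed during replay-based fine-tuning. It is only an illustrative assumption, not the authors' implementation: the function names (per_class_losses, dro_reweighted_loss), the cross-entropy stand-in for CLIP's contrastive objective, the temperature parameter, and the softmax-over-losses update rule are all hypothetical choices; the paper's exact formulation may differ.

# Hypothetical sketch: per-class DRO-style reweighting during replay fine-tuning.
# All names and the update rule are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def per_class_losses(logits, labels, num_classes):
    # Average per-sample loss within each class present in the batch.
    sample_loss = F.cross_entropy(logits, labels, reduction="none")
    class_loss = torch.zeros(num_classes, device=logits.device)
    class_count = torch.zeros(num_classes, device=logits.device)
    class_loss.index_add_(0, labels, sample_loss)
    class_count.index_add_(0, labels, torch.ones_like(sample_loss))
    present = class_count > 0
    return class_loss / class_count.clamp(min=1), present

def dro_reweighted_loss(class_loss, present, temperature=1.0):
    # KL-regularized DRO-style weights: classes with higher loss get more weight.
    # Weights are detached so they only rescale, not reshape, per-class gradients.
    masked = class_loss.detach().masked_fill(~present, float("-inf"))
    weights = F.softmax(masked / temperature, dim=0)
    return (weights * class_loss.masked_fill(~present, 0.0)).sum()

# Toy usage: random logits stand in for CLIP image-text similarity scores over
# a batch mixing new-task samples with a small replay buffer.
num_classes = 10
logits = torch.randn(32, num_classes, requires_grad=True)
labels = torch.randint(0, num_classes, (32,))
cls_loss, present = per_class_losses(logits, labels, num_classes)
loss = dro_reweighted_loss(cls_loss, present, temperature=0.5)
loss.backward()
print(f"reweighted loss: {loss.item():.4f}")

Under this assumed scheme, classes that are currently suffering higher loss (often the under-represented replay classes when the buffer is tiny) receive larger weight, which is one intuitive way a distributionally robust objective could counteract forgetting with minimal memory.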

Cite

Text

King et al. "Memory Efficient Continual Learning with CLIP Models." NeurIPS 2024 Workshops: AFM, 2024.

Markdown

[King et al. "Memory Efficient Continual Learning with CLIP Models." NeurIPS 2024 Workshops: AFM, 2024.](https://mlanthology.org/neuripsw/2024/king2024neuripsw-memory/)

BibTeX

@inproceedings{king2024neuripsw-memory,
  title     = {{Memory Efficient Continual Learning with CLIP Models}},
  author    = {King, Ryan and Li, Gang and Mortazavi, Bobak J and Yang, Tianbao},
  booktitle = {NeurIPS 2024 Workshops: AFM},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/king2024neuripsw-memory/}
}