Mind the Interference: Retaining Pre-Trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

Abstract

This study addresses the Domain-Class Incremental Learning problem, a realistic but challenging continual learning scenario in which both the domain distribution and the target classes vary across tasks. To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability. However, this introduces a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability. Existing methods tackle this issue by tuning VLMs with knowledge distillation on extra datasets, which incurs heavy computational overhead. To address this problem efficiently, we propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, which retains the pre-trained knowledge of VLMs by avoiding information interference. Specifically, we design a fully residual mechanism to infuse newly learned knowledge into a frozen backbone while introducing minimal adverse impact on pre-trained knowledge. In addition, this residual property enables our distribution-aware integration calibration scheme, which explicitly controls the information implantation process for test data from unseen distributions. Experiments demonstrate that DIKI surpasses the current state-of-the-art approach using only 0.86% of the trained parameters and requiring substantially less training time. Code is available at: https://github.com/lloongx/DIKI.
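
The two ideas named in the abstract, a residual update on top of a frozen backbone and a calibration weight that depends on how close a test input is to the training distribution, can be illustrated with a toy sketch. This is not the DIKI implementation: the function names, the fixed weights, and the Gaussian-style weighting around a stored training mean are all assumptions chosen for illustration, and the official repository should be consulted for the actual method.

```python
import math

def frozen_backbone(x):
    """Stand-in for a frozen pre-trained layer (hypothetical fixed weights)."""
    w = [[0.5, -0.2], [0.1, 0.3]]
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def residual_branch(x, delta_w):
    """Newly learned update, kept separate from the frozen weights."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in delta_w]

def calibration_weight(x, train_mean, scale=1.0):
    """Assumed calibration rule: 1 at the training mean, decaying toward 0
    as the input moves away from it (Gaussian-style falloff)."""
    dist_sq = sum((xi - mi) ** 2 for xi, mi in zip(x, train_mean))
    return math.exp(-dist_sq / (2.0 * scale ** 2))

def integrate(x, delta_w, train_mean, scale=1.0):
    """Frozen output plus a calibrated residual; the backbone is never modified."""
    base = frozen_backbone(x)
    update = residual_branch(x, delta_w)
    alpha = calibration_weight(x, train_mean, scale)
    return [b + alpha * u for b, u in zip(base, update)]
```

In this sketch, a zero-valued `delta_w` leaves the frozen output unchanged, and an input far from `train_mean` receives a weight near zero, so the output falls back to the frozen backbone's prediction.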

Cite

Text

Tang et al. "Mind the Interference: Retaining Pre-Trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72764-1_20

Markdown

[Tang et al. "Mind the Interference: Retaining Pre-Trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/tang2024eccv-mind/) doi:10.1007/978-3-031-72764-1_20

BibTeX

@inproceedings{tang2024eccv-mind,
  title     = {{Mind the Interference: Retaining Pre-Trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models}},
  author    = {Tang, Longxiang and Tian, Zhuotao and Li, Kai and He, Chunming and Zhou, Hantao and Zhao, Hengshuang and Li, Xiu and Jia, Jiaya},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72764-1_20},
  url       = {https://mlanthology.org/eccv/2024/tang2024eccv-mind/}
}