Improved Visual Fine-Tuning with Natural Language Supervision

Abstract

Fine-tuning a pre-trained visual model can leverage the semantic information from large-scale pre-training data and mitigate over-fitting on downstream vision tasks with limited training examples. While catastrophic forgetting in the pre-trained backbone has been extensively studied for fine-tuning, the potential bias inherited from the corresponding pre-training task and data has attracted less attention. In this work, we investigate this problem by demonstrating that the classifier obtained after fine-tuning will be close to that induced by the pre-trained model. To reduce the bias in the classifier effectively, we introduce a reference distribution obtained from a fixed text classifier, which can help regularize the learned vision classifier. The proposed method, Text Supervised fine-tuning (TeS), is evaluated with diverse pre-trained vision models, including ResNet and ViT, and text encoders, including BERT and CLIP, on 11 downstream tasks. Consistent improvements by a clear margin across these scenarios confirm the effectiveness of our proposal. Code is available at https://github.com/idstcv/TeS.
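The abstract does not give the exact form of the TeS objective, so the following is only a minimal sketch of the general idea: a reference distribution produced by a fixed text classifier regularizes the distribution predicted by the vision classifier being trained. The use of a KL-divergence term, the `weight` and `temperature` parameters, and all function names are assumptions made for illustration and are not taken from the paper or its repository.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Standard cross-entropy against a single ground-truth class index.
    return -math.log(probs[label])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same classes.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def regularized_loss(vision_logits, text_logits, label,
                     weight=0.5, temperature=1.0):
    # Hypothetical objective: supervised loss on the vision classifier plus
    # a penalty for drifting away from the fixed text classifier's output.
    # text_logits are treated as constants (the text classifier is fixed).
    vision_probs = softmax(vision_logits)
    reference = softmax(text_logits, temperature)
    supervised = cross_entropy(vision_probs, label)
    regularizer = kl_divergence(reference, vision_probs)
    return supervised + weight * regularizer

# Usage with made-up logits for a 3-class problem.
loss = regularized_loss([2.0, 0.5, -1.0], [0.1, 1.5, 0.3], label=0)
print(f"regularized loss: {loss:.4f}")
```

In this sketch, setting `weight` to zero recovers plain cross-entropy fine-tuning, while larger values pull the vision classifier's predictions toward the text-derived reference.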

Cite

Text

Wang et al. "Improved Visual Fine-Tuning with Natural Language Supervision." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01093

Markdown

[Wang et al. "Improved Visual Fine-Tuning with Natural Language Supervision." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/wang2023iccv-improved/) doi:10.1109/ICCV51070.2023.01093

BibTeX

@inproceedings{wang2023iccv-improved,
  title     = {{Improved Visual Fine-Tuning with Natural Language Supervision}},
  author    = {Wang, Junyang and Xu, Yuanhong and Hu, Juhua and Yan, Ming and Sang, Jitao and Qian, Qi},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {11899--11909},
  doi       = {10.1109/ICCV51070.2023.01093},
  url       = {https://mlanthology.org/iccv/2023/wang2023iccv-improved/}
}