Leveraging Self Weak-Supervision for Improved VLM Performance
Abstract
In this work, we present SelfPrompt, a novel prompt-tuning approach for vision-language models (VLMs) in a semi-supervised learning setup. Existing methods for tuning VLMs in a semi-supervised setup struggle with the efficient use of a limited labelling budget, the accumulation of noisy pseudo-labels, and the proper utilization of unlabelled data. SelfPrompt addresses these challenges by introducing (a) a weakly-supervised sampling technique that selects a diverse and representative labelled set, (b) a cluster-guided pseudo-labelling method that improves pseudo-label accuracy, and (c) a confidence-aware semi-supervised learning module that maximizes the utility of unlabelled data by learning from high- and low-confidence pseudo-labels differently. We conduct extensive evaluations across 13 datasets, significantly surpassing the state of the art with an average improvement of 7.92% in semi-supervised learning under a 2-shot setup. Detailed ablation studies confirm the effectiveness of each component.
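The abstract names the confidence-aware module only at a high level. As a rough, non-authoritative sketch of what "learning from high- and low-confidence pseudo-labels differently" could look like, the PyTorch snippet below splits unlabelled samples at a confidence threshold and applies hard cross-entropy to the confident group and a down-weighted soft objective to the rest. The function name, the threshold `tau`, and the weight `low_weight` are hypothetical illustrations, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def confidence_aware_loss(logits_weak, logits_strong, tau=0.95, low_weight=0.1):
    """Hypothetical sketch (not the authors' code): treat unlabelled samples
    differently depending on pseudo-label confidence.

    logits_weak:   logits for weakly augmented views,   shape [B, C]
    logits_strong: logits for strongly augmented views, shape [B, C]
    tau:           confidence threshold separating the two groups (assumed)
    low_weight:    down-weighting factor for low-confidence samples (assumed)
    """
    probs = torch.softmax(logits_weak.detach(), dim=-1)
    conf, pseudo = probs.max(dim=-1)   # pseudo-label confidence and hard label
    high = conf >= tau                 # mask of high-confidence samples

    loss = logits_weak.new_zeros(())
    if high.any():
        # High-confidence: standard cross-entropy on hard pseudo-labels.
        loss = loss + F.cross_entropy(logits_strong[high], pseudo[high])
    if (~high).any():
        # Low-confidence: soft distillation toward the weak-view distribution,
        # down-weighted so noisy pseudo-labels cannot dominate training.
        log_p = F.log_softmax(logits_strong[~high], dim=-1)
        loss = loss + low_weight * F.kl_div(log_p, probs[~high],
                                            reduction="batchmean")
    return loss
```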
Cite
Text
Roy and Etemad. "Leveraging Self Weak-Supervision for Improved VLM Performance." NeurIPS 2024 Workshops: AFM, 2024.

Markdown
[Roy and Etemad. "Leveraging Self Weak-Supervision for Improved VLM Performance." NeurIPS 2024 Workshops: AFM, 2024.](https://mlanthology.org/neuripsw/2024/roy2024neuripsw-leveraging/)

BibTeX
@inproceedings{roy2024neuripsw-leveraging,
title = {{Leveraging Self Weak-Supervision for Improved VLM Performance}},
author = {Roy, Shuvendu and Etemad, Ali},
booktitle = {NeurIPS 2024 Workshops: AFM},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/roy2024neuripsw-leveraging/}
}