SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation
Abstract
We present SelfPrompt, a novel prompt-tuning approach for vision-language models (VLMs) in a semi-supervised learning setup. Existing methods for tuning VLMs in semi-supervised setups struggle with the negative impact of miscalibrated VLMs on pseudo-labelling and with the accumulation of noisy pseudo-labels. SelfPrompt addresses these challenges by introducing a cluster-guided pseudo-labelling method that improves pseudo-label accuracy, and a confidence-aware semi-supervised learning module that maximizes the utilization of unlabelled data by combining supervised learning and weakly-supervised learning. Additionally, we investigate our method in an active semi-supervised learning setup, where the labelled set is strategically selected to ensure the best utilization of a limited labelling budget. To this end, we propose a weakly-supervised sampling technique that selects a diverse and representative labelled set, which can be seamlessly integrated into existing methods to enhance their performance. We conduct extensive evaluations across 13 datasets, significantly surpassing state-of-the-art performance with average improvements of 6.23% in standard semi-supervised learning, 6.25% in active semi-supervised learning, and 4.9% in base-to-novel generalization, using a 2-shot setup. Furthermore, SelfPrompt shows excellent generalization in single-shot settings, achieving an average improvement of 11.78%.
Cite
Text
Roy and Etemad. "SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation." Transactions on Machine Learning Research, 2026.
Markdown
[Roy and Etemad. "SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/roy2026tmlr-selfprompt/)
BibTeX
@article{roy2026tmlr-selfprompt,
title = {{SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation}},
author = {Roy, Shuvendu and Etemad, Ali},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/roy2026tmlr-selfprompt/}
}