SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation

Abstract

We present SelfPrompt, a novel prompt-tuning approach for vision-language models (VLMs) in a semi-supervised learning setup. Existing methods for tuning VLMs in semi-supervised setups struggle with two problems: miscalibrated VLMs degrade pseudo-labelling, and noisy pseudo-labels accumulate during training. SelfPrompt addresses these challenges by introducing a cluster-guided pseudo-labelling method that improves pseudo-label accuracy, and a confidence-aware semi-supervised learning module that maximizes the utilization of unlabelled data by combining supervised learning and weakly-supervised learning. Additionally, we investigate our method in an active semi-supervised learning setup, where the labelled set is strategically selected to make the best use of a limited labelling budget. To this end, we propose a weakly-supervised sampling technique that selects a diverse and representative labelled set, and which can be seamlessly integrated into existing methods to enhance their performance. In extensive evaluations across 13 datasets, SelfPrompt significantly surpasses state-of-the-art performance, with average improvements of 6.23% in standard semi-supervised learning, 6.25% in active semi-supervised learning, and 4.9% in base-to-novel generalization, using a 2-shot setup. Furthermore, SelfPrompt shows excellent generalization in single-shot settings, achieving an average improvement of 11.78%.
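
The abstract does not spell out how the confidence-aware module separates supervised from weakly-supervised training signals. As a minimal sketch of one plausible reading, the snippet below routes each unlabelled sample by its top predicted probability: confident samples receive a single hard pseudo-label, and the rest receive a small set of candidate labels that a weakly-supervised loss could use. The function name `split_by_confidence`, the `threshold` value, and the `top_k` candidate-set size are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple


def split_by_confidence(
    probs: List[List[float]],
    threshold: float = 0.8,  # hypothetical cut-off, not a value from the paper
    top_k: int = 2,          # hypothetical candidate-set size
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, List[int]]]]:
    """Split unlabelled samples into hard pseudo-labels and candidate sets.

    probs[i][c] is the predicted probability of class c for sample i.
    Returns (confident, uncertain), where confident holds (index, label)
    pairs and uncertain holds (index, candidate_labels) pairs.
    """
    confident: List[Tuple[int, int]] = []
    uncertain: List[Tuple[int, List[int]]] = []
    for idx, p in enumerate(probs):
        # Class indices ordered from most to least probable.
        ranked = sorted(range(len(p)), key=lambda c: p[c], reverse=True)
        if p[ranked[0]] >= threshold:
            # Confident prediction: treat the top class as a hard label.
            confident.append((idx, ranked[0]))
        else:
            # Uncertain prediction: keep the top-k classes as candidates.
            uncertain.append((idx, ranked[:top_k]))
    return confident, uncertain


probs = [
    [0.90, 0.05, 0.05],  # confident in class 0
    [0.40, 0.35, 0.25],  # uncertain
    [0.10, 0.15, 0.75],  # just below the assumed threshold
]
confident, uncertain = split_by_confidence(probs)
```

In this toy input only the first sample clears the assumed threshold, so the other two keep their two most probable classes as candidates rather than being discarded, which is one way unlabelled data could still contribute when predictions are not confident.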

Cite

Text

Roy and Etemad. "SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation." Transactions on Machine Learning Research, 2026.

Markdown

[Roy and Etemad. "SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/roy2026tmlr-selfprompt/)

BibTeX

@article{roy2026tmlr-selfprompt,
  title     = {{SelfPrompt: Confidence-Aware Semi-Supervised Tuning for Improved Vision-Language Model Adaptation}},
  author    = {Roy, Shuvendu and Etemad, Ali},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/roy2026tmlr-selfprompt/}
}