Black-Box Test-Time Prompt Tuning for Vision-Language Models

Abstract

Test-time prompt tuning (TPT) aims to adapt vision-language models (e.g., CLIP) with learnable prompts during the inference phase. However, previous works overlooked that pre-trained models as a service (MaaS) have become a noticeable trend due to their commercial usage and potential risk of misuse. In the context of MaaS, users can only design prompts in the inputs and query the black-box vision-language models through inference APIs, rendering the previous paradigm of gradient-based prompt tuning infeasible. In this paper, we propose black-box test-time prompt tuning (B²TPT), a novel framework that addresses the challenge of optimizing prompts without gradients in an unsupervised manner. Specifically, B²TPT designs a consistent or confident (CoC) pseudo-labeling strategy to generate high-quality pseudo-labels from the model outputs. Subsequently, we propose to optimize low-dimensional intrinsic prompts using a derivative-free evolution algorithm and to project them onto the original text and vision prompts. This strategy addresses the gradient-free challenge while reducing the dimensionality of the search space. Extensive experiments across 15 datasets demonstrate the superiority of B²TPT. The results show that B²TPT not only outperforms CLIP's zero-shot inference at test time, but also surpasses gradient-based TPT methods.
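
The idea of optimizing a low-dimensional intrinsic prompt without gradients and projecting it into the full prompt space can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function name `optimize_intrinsic_prompt`, the fixed Gaussian projection matrix `A`, the simple elite-averaging evolution strategy (the abstract does not name the specific algorithm), and the toy quadratic loss that stands in for a loss computed from pseudo-labels returned by a black-box API.

```python
import numpy as np


def optimize_intrinsic_prompt(loss_fn, prompt_dim, intrinsic_dim=8,
                              popsize=16, iters=30, sigma=0.5, seed=0):
    """Gradient-free search over a low-dimensional vector z.

    loss_fn is treated as a black box: it only receives a full prompt
    vector and returns a scalar, so no gradients are ever required.
    """
    rng = np.random.default_rng(seed)
    # Assumption: a fixed random linear map from intrinsic to prompt space.
    A = rng.normal(0.0, 1.0 / np.sqrt(intrinsic_dim),
                   size=(prompt_dim, intrinsic_dim))

    mean = np.zeros(intrinsic_dim)
    best_z = mean.copy()
    best_loss = float(loss_fn(A @ mean))

    for _ in range(iters):
        # Sample a population of candidates around the current mean.
        candidates = mean + sigma * rng.normal(size=(popsize, intrinsic_dim))
        # One black-box query per candidate.
        losses = np.array([loss_fn(A @ z) for z in candidates])
        # Move the mean to the average of the best quarter of the population.
        elite = candidates[np.argsort(losses)[: popsize // 4]]
        mean = elite.mean(axis=0)
        # Keep track of the best candidate seen so far.
        i = int(np.argmin(losses))
        if losses[i] < best_loss:
            best_z, best_loss = candidates[i].copy(), float(losses[i])

    return A @ best_z, best_loss


def toy_loss(prompt):
    # Hypothetical stand-in for a pseudo-label loss from an inference API.
    return float(np.sum((prompt - 1.0) ** 2))


prompt, final_loss = optimize_intrinsic_prompt(toy_loss, prompt_dim=32)
```

Searching over `intrinsic_dim` values instead of `prompt_dim` values keeps the number of parameters the evolution strategy must explore small, which is the complexity reduction the abstract refers to. In a real setting, each call to `loss_fn` would correspond to a query to the model API, so `popsize * iters` bounds the query budget.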

Cite

Text

Meng et al. "Black-Box Test-Time Prompt Tuning for Vision-Language Models." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I6.32652

Markdown

[Meng et al. "Black-Box Test-Time Prompt Tuning for Vision-Language Models." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/meng2025aaai-black/) doi:10.1609/AAAI.V39I6.32652

BibTeX

@inproceedings{meng2025aaai-black,
  title     = {{Black-Box Test-Time Prompt Tuning for Vision-Language Models}},
  author    = {Meng, Fan'an and Cui, Chaoran and Dai, Hongjun and Gong, Shuai},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {6099-6107},
  doi       = {10.1609/AAAI.V39I6.32652},
  url       = {https://mlanthology.org/aaai/2025/meng2025aaai-black/}
}