Diversity-Aware Meta Visual Prompting

Abstract

We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient and effective prompting method for transferring pre-trained models to downstream tasks with a frozen backbone. A challenging issue in visual prompting is that image datasets sometimes have large data diversity, and a single generic per-dataset prompt can hardly handle the complex distribution shift relative to the original pretraining data distribution. To address this issue, we propose a dataset Diversity-Aware prompting strategy whose initialization is realized by a Meta-prompt. Specifically, we cluster the downstream dataset into small homogeneous subsets in a diversity-adaptive way, with each subset having its own prompt optimized separately. Such a divide-and-conquer design greatly reduces the optimization difficulty and significantly boosts the prompting performance. Furthermore, all the prompts are initialized with a meta-prompt, which is learned across several datasets. This is a bootstrapped paradigm, built on the key observation that the prompting knowledge learned from previous datasets can help the prompt converge faster and perform better on a new dataset. During inference, we dynamically select a proper prompt for each input, based on the feature distance between the input and each subset. In extensive experiments, DAM-VP demonstrates superior efficiency and effectiveness, clearly surpassing previous prompting methods on a series of downstream datasets for different pretraining models. Our code is available at: https://github.com/shikiw/DAM-VP.
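
The pipeline described in the abstract (diversity-adaptive clustering, one prompt per subset initialized from a shared meta-prompt, and distance-based prompt selection at inference) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the greedy threshold clustering, the Euclidean distance, and the names `cluster_by_threshold`, `init_prompts`, and `select_prompt` are all assumptions made for illustration, and the features stand in for frozen-backbone embeddings. See the linked repository for the actual method.

```python
import numpy as np

def cluster_by_threshold(features, threshold):
    """Greedy clustering (an assumed stand-in for diversity-adaptive clustering).

    A sample joins the nearest existing cluster if it lies within `threshold`
    of that cluster's prototype; otherwise it starts a new cluster. More
    diverse data therefore yields more clusters.
    """
    prototypes, counts, labels = [], [], []
    for f in features:
        f = np.asarray(f, dtype=float)
        if prototypes:
            dists = np.linalg.norm(np.stack(prototypes) - f, axis=1)
            k = int(np.argmin(dists))
        if not prototypes or dists[k] > threshold:
            prototypes.append(f.copy())
            counts.append(1)
            labels.append(len(prototypes) - 1)
        else:
            counts[k] += 1
            # Running-mean update of the cluster prototype.
            prototypes[k] += (f - prototypes[k]) / counts[k]
            labels.append(k)
    return np.stack(prototypes), np.array(labels)

def init_prompts(meta_prompt, n_clusters):
    """Give every cluster its own trainable copy of the shared meta-prompt."""
    return np.repeat(meta_prompt[None, ...], n_clusters, axis=0).copy()

def select_prompt(feature, prototypes):
    """Return the index of the cluster prototype nearest to the input feature."""
    dists = np.linalg.norm(prototypes - np.asarray(feature, dtype=float), axis=1)
    return int(np.argmin(dists))

# Toy 2-D "features" forming two well-separated groups.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
prototypes, labels = cluster_by_threshold(feats, threshold=1.0)

# Hypothetical meta-prompt shaped like a small image (channels, height, width).
meta_prompt = np.zeros((3, 4, 4))
prompts = init_prompts(meta_prompt, len(prototypes))

# At inference, pick the prompt whose cluster is closest to the input feature.
chosen = select_prompt([4.9, 5.2], prototypes)
```

In this sketch the threshold controls how many prompts are created: a low-diversity dataset collapses to a single cluster and hence a single prompt, while a high-diversity dataset is split into several subsets, each with its own prompt to optimize.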

Cite

Text

Huang et al. "Diversity-Aware Meta Visual Prompting." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01047

Markdown

[Huang et al. "Diversity-Aware Meta Visual Prompting." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/huang2023cvpr-diversityaware/) doi:10.1109/CVPR52729.2023.01047

BibTeX

@inproceedings{huang2023cvpr-diversityaware,
  title     = {{Diversity-Aware Meta Visual Prompting}},
  author    = {Huang, Qidong and Dong, Xiaoyi and Chen, Dongdong and Zhang, Weiming and Wang, Feifei and Hua, Gang and Yu, Nenghai},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {10878-10887},
  doi       = {10.1109/CVPR52729.2023.01047},
  url       = {https://mlanthology.org/cvpr/2023/huang2023cvpr-diversityaware/}
}