BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
Abstract
With the surge of large-scale pre-trained models (PTMs), fine-tuning these models for numerous downstream tasks has become a crucial problem. Consequently, parameter-efficient transfer learning (PETL) of large models has attracted considerable attention. While recent PETL methods showcase impressive performance, they rely on optimistic assumptions: 1) the entire parameter set of a PTM is available, and 2) sufficient memory capacity for fine-tuning is available. However, in most real-world applications, PTMs are served as a black-box API or proprietary software without explicit parameter accessibility. Moreover, it is hard to meet the large memory requirements of modern PTMs. In this work, we propose black-box visual prompting (BlackVIP), which efficiently adapts PTMs without knowledge of model architectures and parameters. BlackVIP has two components: 1) the Coordinator and 2) simultaneous perturbation stochastic approximation with gradient correction (SPSA-GC). The Coordinator designs input-dependent, image-shaped visual prompts, which improve few-shot adaptation and robustness to distribution/location shift. SPSA-GC efficiently estimates the gradient of a target model to update the Coordinator. Extensive experiments on 16 datasets demonstrate that BlackVIP enables robust adaptation to diverse domains without accessing PTMs' parameters, with minimal memory requirements. Code: https://github.com/changdaeoh/BlackVIP
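The abstract names SPSA as the gradient estimator but does not spell out its form. As a rough illustration only, the sketch below shows plain two-sided SPSA on a toy objective that can be queried but not differentiated. It does not include the paper's gradient correction or the Coordinator, and the names `spsa_gradient`, `black_box_loss`, and `minimize`, along with the step sizes, are hypothetical choices for this example rather than anything taken from the BlackVIP code.

```python
import numpy as np


def spsa_gradient(loss_fn, params, c=0.01, rng=None):
    """Two-sided SPSA estimate of the gradient of loss_fn at params.

    Needs only two loss evaluations, regardless of the parameter dimension.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Rademacher perturbation: each entry is +1 or -1 with equal probability.
    delta = rng.choice([-1.0, 1.0], size=params.shape)
    loss_plus = loss_fn(params + c * delta)
    loss_minus = loss_fn(params - c * delta)
    # Entries of delta are +/-1, so dividing by delta equals multiplying by it.
    return (loss_plus - loss_minus) / (2.0 * c) * delta


def black_box_loss(params):
    # Hypothetical stand-in for a query-only model: quadratic with minimum at 3.
    return float(np.sum((params - 3.0) ** 2))


def minimize(loss_fn, init, steps=2000, lr=0.01, c=0.01, seed=0):
    # Plain gradient descent driven by the SPSA estimate (no gradient correction).
    rng = np.random.default_rng(seed)
    params = np.array(init, dtype=float)
    for _ in range(steps):
        params = params - lr * spsa_gradient(loss_fn, params, c=c, rng=rng)
    return params


if __name__ == "__main__":
    print(minimize(black_box_loss, np.zeros(5)))
```

The point of the sketch is that the optimizer only ever calls `loss_fn`, which matches the black-box setting the abstract describes: no parameters or backpropagated gradients of the target model are required.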
Cite
Text
Oh et al. "BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02320
Markdown
[Oh et al. "BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/oh2023cvpr-blackvip/) doi:10.1109/CVPR52729.2023.02320
BibTeX
@inproceedings{oh2023cvpr-blackvip,
title = {{BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning}},
author = {Oh, Changdae and Hwang, Hyeji and Lee, Hee-young and Lim, YongTaek and Jung, Geunyoung and Jung, Jiyoung and Choi, Hosik and Song, Kyungwoo},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2023},
pages = {24224-24235},
doi = {10.1109/CVPR52729.2023.02320},
url = {https://mlanthology.org/cvpr/2023/oh2023cvpr-blackvip/}
}