Any-Shift Prompting for Generalization over Distributions

Abstract

Image-language models with prompt learning have shown remarkable advances in numerous downstream vision tasks. Nevertheless, conventional prompt learning methods overfit the training distribution and lose the generalization ability on test distributions. To improve generalization across various distribution shifts, we propose any-shift prompting: a general probabilistic inference framework that considers the relationship between training and test distributions during prompt learning. We explicitly connect training and test distributions in the latent space by constructing training and test prompts in a hierarchical architecture. Within this framework, the test prompt exploits the distribution relationships to guide the generalization of the CLIP image-language model from training to any test distribution. To effectively encode the distribution information and their relationships, we further introduce a transformer inference network with a pseudo-shift training mechanism. The network generates the tailored test prompt with both training and test information in a feedforward pass, avoiding extra training costs at test time. Extensive experiments on twenty-three datasets demonstrate the effectiveness of any-shift prompting for generalization over various distribution shifts.
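The abstract describes a transformer inference network that produces a tailored test prompt from both training and test information in a single feedforward pass. The paper does not include code here, so the following is only a minimal illustrative sketch of that idea, assuming a PyTorch setup where a learned training prompt and CLIP image features of a test batch are fused by a small transformer; all names, dimensions, and the readout scheme are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): a transformer "inference network"
# that maps training-prompt tokens and test image features to a test prompt
# in one feedforward pass, i.e., without gradient updates at test time.
import torch
import torch.nn as nn

class TestPromptGenerator(nn.Module):
    def __init__(self, dim=512, n_prompt_tokens=4, n_heads=8, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Learnable query tokens that are read out as the test prompt.
        self.prompt_queries = nn.Parameter(torch.randn(n_prompt_tokens, dim))

    def forward(self, train_prompt, test_features):
        # train_prompt:  (n_train_tokens, dim) -- learned on the training distribution
        # test_features: (batch, dim)          -- CLIP image features of the test batch
        b = test_features.size(0)
        tokens = torch.cat([
            self.prompt_queries.expand(b, -1, -1),  # queries for the test prompt
            train_prompt.expand(b, -1, -1),         # training-distribution information
            test_features.unsqueeze(1),             # test-distribution information
        ], dim=1)
        out = self.encoder(tokens)
        # Read out the query positions as the tailored test prompt tokens.
        return out[:, : self.prompt_queries.size(0), :]

# Usage: one forward pass per test batch, no extra training cost at test time.
gen = TestPromptGenerator()
train_prompt = torch.randn(4, 512)
test_feats = torch.randn(8, 512)
with torch.no_grad():
    test_prompt = gen(train_prompt, test_feats)  # shape: (8, 4, 512)
```

The sketch only illustrates why a feedforward generator avoids test-time optimization: conditioning on both prompt and features lets the network adapt to the test distribution without backpropagation.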

Cite

Text

Xiao et al. "Any-Shift Prompting for Generalization over Distributions." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01314

Markdown

[Xiao et al. "Any-Shift Prompting for Generalization over Distributions." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/xiao2024cvpr-anyshift/) doi:10.1109/CVPR52733.2024.01314

BibTeX

@inproceedings{xiao2024cvpr-anyshift,
  title     = {{Any-Shift Prompting for Generalization over Distributions}},
  author    = {Xiao, Zehao and Shen, Jiayi and Derakhshani, Mohammad Mahdi and Liao, Shengcai and Snoek, Cees G. M.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {13849--13860},
  doi       = {10.1109/CVPR52733.2024.01314},
  url       = {https://mlanthology.org/cvpr/2024/xiao2024cvpr-anyshift/}
}