Conformal Prediction for Zero-Shot Models

Abstract

Vision-language models pre-trained at large scale have shown unprecedented adaptability and generalization to downstream tasks. Although their discriminative potential has been widely explored, their reliability and uncertainty are still overlooked. In this work, we investigate the capabilities of CLIP models under the split conformal prediction paradigm, which provides theoretical guarantees for black-box models based on a small, labeled calibration set. In contrast to the main body of literature on conformal predictors in vision classifiers, foundation models exhibit a particular characteristic: they are pre-trained once on an inaccessible source domain that differs from the target task. This domain drift negatively affects the efficiency of the conformal sets and poses additional challenges. To alleviate this issue, we propose Conf-OT, a transfer learning setting that operates transductively over the combined calibration and query sets. By solving an optimal transport problem, the proposed method bridges the domain gap between pre-training and adaptation without requiring additional data splits, while still maintaining coverage guarantees. We comprehensively explore this conformal prediction strategy on a broad span of 15 datasets and three non-conformity scores. Conf-OT provides consistent relative improvements of up to 20% in set efficiency while being 15 times faster than popular transductive approaches.
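As a rough illustration of the split conformal prediction paradigm the abstract refers to, the sketch below calibrates a threshold on a small labeled set and then builds prediction sets for new samples. It is a generic sketch, not the Conf-OT method: the optimal transport step is not shown. It assumes one common non-conformity score (one minus the probability of the true class, often called LAC), which may or may not be among the three scores evaluated in the paper. The function names and the `simulate` helper are hypothetical, and synthetic softmax outputs stand in for zero-shot class probabilities.

```python
import numpy as np


def lac_threshold(cal_probs, cal_labels, alpha):
    """Calibrate a score threshold from a labeled calibration set."""
    n = len(cal_labels)
    # Non-conformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: include every class.
        return np.inf
    return np.sort(scores)[k - 1]


def prediction_sets(probs, qhat):
    """Boolean mask (n_samples, n_classes): True where a class is in the set."""
    return (1.0 - probs) <= qhat


def simulate(n, n_classes, rng):
    """Hypothetical stand-in for model outputs: random softmax probabilities,
    with labels drawn from those same probabilities."""
    logits = rng.normal(scale=2.0, size=(n, n_classes))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    labels = np.array([rng.choice(n_classes, p=p) for p in probs])
    return probs, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cal_probs, cal_labels = simulate(500, 10, rng)
    test_probs, test_labels = simulate(2000, 10, rng)

    qhat = lac_threshold(cal_probs, cal_labels, alpha=0.1)
    sets = prediction_sets(test_probs, qhat)

    coverage = sets[np.arange(len(test_labels)), test_labels].mean()
    avg_size = sets.sum(axis=1).mean()
    print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

With exchangeable calibration and test data, the true label falls inside the set with probability at least 1 - alpha on average. Set efficiency, the quantity the abstract reports improvements on, corresponds here to the average set size: smaller sets at the same coverage are more efficient.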

Cite

Text

Silva-Rodríguez et al. "Conformal Prediction for Zero-Shot Models." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01856

Markdown

[Silva-Rodríguez et al. "Conformal Prediction for Zero-Shot Models." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/silvarodriguez2025cvpr-conformal/) doi:10.1109/CVPR52734.2025.01856

BibTeX

@inproceedings{silvarodriguez2025cvpr-conformal,
  title     = {{Conformal Prediction for Zero-Shot Models}},
  author    = {Silva-Rodríguez, Julio and Ayed, Ismail Ben and Dolz, Jose},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {19931--19941},
  doi       = {10.1109/CVPR52734.2025.01856},
  url       = {https://mlanthology.org/cvpr/2025/silvarodriguez2025cvpr-conformal/}
}