How Well Do Contrastively Trained Models Transfer?

Abstract

There are two prevailing methods for pre-training on large datasets to learn transferable representations: 1) supervised pre-training on large but weakly-labeled datasets; 2) contrastive training on image-only data and (image, text) pairs. While supervised pre-training learns good representations that can be transferred to a wide range of tasks, contrastive models such as CLIP have demonstrated unprecedented zero-shot transfer. In this work, we compare the transferability of these two approaches on multiple downstream tasks. The pre-training distributions we consider include YFCC, Conceptual Captions, and ImageNet-21K, while the pre-training objectives range from supervised learning to SimCLR, CLIP, and SLIP. We observe that different pre-training methods with the same training source transfer similarly, given their ImageNet accuracy.
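For context on the contrastive image-text objective mentioned above (the one used by CLIP and, in part, SLIP), the sketch below shows a symmetric InfoNCE loss over a batch of paired image and text embeddings. This is a minimal illustrative implementation, not the authors' code; the function name, temperature value, and variable names are assumptions.

```python
import torch
import torch.nn.functional as F


def clip_contrastive_loss(image_embeds, text_embeds, temperature=0.07):
    """Symmetric InfoNCE loss for (image, text) pairs.

    image_embeds, text_embeds: (batch, dim) tensors; row i of each is a positive pair.
    Illustrative sketch only; hyperparameters are placeholders.
    """
    # Normalize so dot products become cosine similarities.
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)

    # (batch, batch) similarity matrix; diagonal entries are the true pairs.
    logits = image_embeds @ text_embeds.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions (image-to-text and text-to-image), averaged.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```

Image-only contrastive methods such as SimCLR use the same kind of loss, but the two embeddings come from two augmented views of the same image rather than from an image and its caption.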

Cite

Text

Shariatnia et al. "How Well Do Contrastively Trained Models Transfer?" ICML 2022 Workshops: Pre-Training, 2022.

Markdown

[Shariatnia et al. "How Well Do Contrastively Trained Models Transfer?" ICML 2022 Workshops: Pre-Training, 2022.](https://mlanthology.org/icmlw/2022/shariatnia2022icmlw-well/)

BibTeX

@inproceedings{shariatnia2022icmlw-well,
  title     = {{How Well Do Contrastively Trained Models Transfer?}},
  author    = {Shariatnia, M. Moein and Entezari, Rahim and Wortsman, Mitchell and Saukh, Olga and Schmidt, Ludwig},
  booktitle = {ICML 2022 Workshops: Pre-Training},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/shariatnia2022icmlw-well/}
}