The Trade-Off Between Universality and Label Efficiency of Representations from Contrastive Learning

Abstract

Pre-training representations (a.k.a. foundation models) has recently become a prevalent learning paradigm, where one first pre-trains a representation using large-scale unlabeled data, and then learns simple predictors on top of the representation using a small amount of labeled data from the downstream tasks. There are two key desiderata for the representation: label efficiency (the ability to learn an accurate classifier on top of the representation with a small amount of labeled data) and universality (usefulness across a wide range of downstream tasks). In this paper, we focus on one of the most popular instantiations of this paradigm: contrastive learning with linear probing, i.e., learning a linear predictor on the representation pre-trained by contrastive learning. We show that there exists a trade-off between the two desiderata, so that one may not be able to achieve both simultaneously. Specifically, we provide analysis using a theoretical data model and show that, while more diverse pre-training data yield more diverse features for different tasks (improving universality), they put less emphasis on task-specific features, giving rise to larger sample complexity for downstream supervised tasks and thus worse prediction performance. Guided by this analysis, we propose a contrastive regularization method to improve the trade-off. We validate our analysis and method empirically with systematic experiments using real-world datasets and foundation models.
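For readers unfamiliar with the linear probing setup the abstract refers to, the following is a minimal sketch: a contrastively pre-trained encoder is frozen and only a linear classifier is trained on the small labeled downstream dataset. The encoder architecture, dimensions, and synthetic data below are placeholder assumptions for illustration, not the paper's actual setup or its contrastive regularization method.

```python
import torch
import torch.nn as nn

# Stand-in for a contrastively pre-trained encoder (placeholder architecture).
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
for p in encoder.parameters():
    p.requires_grad = False  # freeze the pre-trained representation

# Linear probe: the only trainable component on the downstream task.
probe = nn.Linear(16, 10)
optimizer = torch.optim.SGD(probe.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Small labeled downstream dataset (synthetic here).
x = torch.randn(256, 32)
y = torch.randint(0, 10, (256,))

for _ in range(100):
    with torch.no_grad():
        z = encoder(x)          # fixed features from the frozen encoder
    loss = criterion(probe(z), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because only the linear head is trained, the downstream sample complexity is governed by how well the frozen features align with the task, which is exactly the quantity the paper's trade-off analysis concerns.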

Cite

Text

Shi et al. "The Trade-Off Between Universality and Label Efficiency of Representations from Contrastive Learning." International Conference on Learning Representations, 2023.

Markdown

[Shi et al. "The Trade-Off Between Universality and Label Efficiency of Representations from Contrastive Learning." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/shi2023iclr-tradeoff/)

BibTeX

@inproceedings{shi2023iclr-tradeoff,
  title     = {{The Trade-Off Between Universality and Label Efficiency of Representations from Contrastive Learning}},
  author    = {Shi, Zhenmei and Chen, Jiefeng and Li, Kunyang and Raghuram, Jayaram and Wu, Xi and Liang, Yingyu and Jha, Somesh},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/shi2023iclr-tradeoff/}
}