The Trade-Off Between Label Efficiency and Universality of Representations from Contrastive Learning
Abstract
The pre-trained representation learning paradigm is a popular recent approach to addressing distribution shift and limited training data. This approach first pre-trains a representation function on large unlabeled datasets from multiple tasks via self-supervised (e.g., contrastive) learning, and then learns a simple classifier on top of the representation using small labeled datasets from the downstream target tasks. The representation should have two key properties: label efficiency (i.e., the ability to learn an accurate classifier with a small amount of labeled data) and universality (i.e., usefulness across a wide range of downstream tasks). In this paper, we focus on contrastive learning and systematically study the trade-off between label efficiency and universality, both theoretically and empirically. We empirically show that this trade-off exists across different models and datasets. Theoretically, we propose a data model with a hidden representation and provide an analysis in a simplified linear setting. Our analysis shows that, compared to pre-training on the target task, pre-training on diverse tasks leads to a larger sample complexity for learning the optimal classifier, and thus to worse prediction performance.
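To make the two-stage paradigm described in the abstract concrete, below is a minimal sketch of contrastive pre-training followed by a linear probe on a small labeled set. It assumes a SimCLR-style InfoNCE loss, a tiny MLP encoder, and random tensors standing in for the unlabeled and labeled data; none of these choices reflect the paper's actual models, datasets, or hyperparameters.

```python
# Minimal sketch (assumptions noted above): contrastive pre-training, then a
# frozen-representation linear probe on a small labeled downstream set.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss between two augmented views; positives lie on the diagonal."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (B, B) similarity matrix
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))

# Stage 1: contrastive pre-training on (pseudo) unlabeled data.
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(100):
    x = torch.randn(128, 32)                    # batch of unlabeled inputs
    v1 = x + 0.1 * torch.randn_like(x)          # two "augmented" views
    v2 = x + 0.1 * torch.randn_like(x)
    loss = info_nce(encoder(v1), encoder(v2))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: simple (linear) classifier on the frozen representation,
# trained with a small labeled dataset from the downstream task.
x_small, y_small = torch.randn(64, 32), torch.randint(0, 2, (64,))
with torch.no_grad():
    feats = encoder(x_small)                    # fixed pre-trained features
probe = nn.Linear(16, 2)
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(200):
    loss = F.cross_entropy(probe(feats), y_small)
    opt.zero_grad(); loss.backward(); opt.step()
```

In this setup, label efficiency corresponds to how well the linear probe performs given only the small labeled set, while universality corresponds to how well the same frozen encoder serves many different downstream tasks.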
Cite
Text
Shi et al. "The Trade-Off Between Label Efficiency and Universality of Representations from Contrastive Learning." ICML 2022 Workshops: Pre-Training, 2022.
Markdown
[Shi et al. "The Trade-Off Between Label Efficiency and Universality of Representations from Contrastive Learning." ICML 2022 Workshops: Pre-Training, 2022.](https://mlanthology.org/icmlw/2022/shi2022icmlw-tradeoff/)
BibTeX
@inproceedings{shi2022icmlw-tradeoff,
title = {{The Trade-Off Between Label Efficiency and Universality of Representations from Contrastive Learning}},
author = {Shi, Zhenmei and Chen, Jiefeng and Li, Kunyang and Raghuram, Jayaram and Wu, Xi and Liang, Yingyu and Jha, Somesh},
booktitle = {ICML 2022 Workshops: Pre-Training},
year = {2022},
url = {https://mlanthology.org/icmlw/2022/shi2022icmlw-tradeoff/}
}