The Role of Pre-Training Data in Transfer Learning

Abstract

We explore which pre-training dataset should be used to achieve the best transfer learning performance. We investigate the impact of pre-training on few-shot and full fine-tuning performance using 7 pre-training datasets and 9 downstream datasets. Through extensive controlled experiments, we find that the choice of pre-training dataset is essential for few-shot transfer, but that its role decreases as more data is made available for fine-tuning. Additionally, we explore the role of data curation and examine the trade-offs between label noise and the size of the pre-training dataset. We find that using 2000× more pre-training data from LAION can match the performance of supervised ImageNet pre-training.
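To make the evaluation protocol described above concrete, below is a minimal sketch (not the authors' code) of a k-shot transfer evaluation: a pre-trained backbone gets a new classification head and is fine-tuned on k examples per class of a downstream dataset, then scored on the test set. The backbone (a torchvision ImageNet-pre-trained ResNet-50), the downstream dataset (CIFAR-10), and all hyperparameters are illustrative assumptions standing in for the paper's 7 pre-training and 9 downstream datasets.

```python
# Hypothetical sketch of a k-shot transfer evaluation; not the authors' code.
import random

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, models, transforms


def few_shot_indices(labels, k, seed=0):
    """Pick k training indices per class from a list of integer labels."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    picked = []
    for idxs in by_class.values():
        picked.extend(rng.sample(idxs, k))
    return picked


# Downstream dataset (CIFAR-10 as an illustrative stand-in).
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
train_set = datasets.CIFAR10("data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("data", train=False, download=True, transform=transform)

# Pre-trained backbone stands in for one choice of pre-training dataset;
# swapping the weights is how the pre-training data would be varied.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 10)  # fresh head for the downstream task

k = 5  # shots per class
subset = Subset(train_set, few_shot_indices(train_set.targets, k))
loader = DataLoader(subset, batch_size=32, shuffle=True)

# Full fine-tuning: all parameters are updated, not just the head.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
model.train()
for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# Score on the held-out test set.
model.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in DataLoader(test_set, batch_size=256):
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print(f"{k}-shot test accuracy: {correct / total:.3f}")
```

Growing k toward the full training set interpolates between the few-shot and full fine-tuning regimes compared in the paper.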

Cite

Text

Entezari et al. "The Role of Pre-Training Data in Transfer Learning." ICLR 2023 Workshops: MRL, 2023.

Markdown

[Entezari et al. "The Role of Pre-Training Data in Transfer Learning." ICLR 2023 Workshops: MRL, 2023.](https://mlanthology.org/iclrw/2023/entezari2023iclrw-role/)

BibTeX

@inproceedings{entezari2023iclrw-role,
  title     = {{The Role of Pre-Training Data in Transfer Learning}},
  author    = {Entezari, Rahim and Wortsman, Mitchell and Saukh, Olga and Shariatnia, M. Moein and Sedghi, Hanie and Schmidt, Ludwig},
  booktitle = {ICLR 2023 Workshops: MRL},
  year      = {2023},
  url       = {https://mlanthology.org/iclrw/2023/entezari2023iclrw-role/}
}