Insights into Pre-Training via Simpler Synthetic Tasks

Abstract

Pre-training produces representations that are effective for a wide range of downstream tasks, but it is still unclear which properties of pre-training are necessary for these gains. Notably, recent work shows that even pre-training on synthetic tasks can achieve significant gains on downstream tasks. In this work, we perform three experiments that iteratively simplify pre-training and show that the simplifications still retain much of its benefits. First, building on prior work, we perform a systematic evaluation of three existing synthetic pre-training methods on six downstream tasks. We find that the best synthetic pre-training method, LIME, attains an average of $67\%$ of the benefits of natural pre-training. Second, to our surprise, we find that pre-training on a simple and generic synthetic task defined by the set function achieves $65\%$ of the benefits, almost matching LIME. Third, we find that $39\%$ of the benefits can be attained by using merely the parameter statistics of synthetic pre-training. We release the source code at https://github.com/felixzli/synthetic_pretraining.
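
To make the "set function" task mentioned in the abstract concrete, below is a minimal, hypothetical sketch of how such a synthetic pre-training example might be generated: the input is a random token sequence and the target is its set of unique tokens. The function name, vocabulary size, sequence lengths, and the first-occurrence-order formulation here are illustrative assumptions, not the paper's exact specification; see the released source code for the authors' implementation.

```python
# Illustrative sketch only (assumed task formulation, not the paper's exact definition):
# input  = a random sequence of token ids
# target = the unique tokens of the input, in order of first appearance
import random

def make_set_example(vocab_size=100, min_len=5, max_len=30, rng=random):
    """Generate one (input, target) pair for the illustrative set task."""
    length = rng.randint(min_len, max_len)
    inputs = [rng.randrange(vocab_size) for _ in range(length)]
    seen, target = set(), []
    for tok in inputs:
        if tok not in seen:  # keep only the first occurrence of each token
            seen.add(tok)
            target.append(tok)
    return inputs, target

if __name__ == "__main__":
    x, y = make_set_example()
    print("input: ", x)
    print("target:", y)
```

A sequence-to-sequence model pre-trained on pairs like these would then be fine-tuned on the downstream tasks evaluated in the paper.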

Cite

Text

Wu et al. "Insights into Pre-Training via Simpler Synthetic Tasks." Neural Information Processing Systems, 2022.

Markdown

[Wu et al. "Insights into Pre-Training via Simpler Synthetic Tasks." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/wu2022neurips-insights/)

BibTeX

@inproceedings{wu2022neurips-insights,
  title     = {{Insights into Pre-Training via Simpler Synthetic Tasks}},
  author    = {Wu, Yuhuai and Li, Felix and Liang, Percy},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/wu2022neurips-insights/}
}