Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
Abstract
Pretrained language models (PLMs) have demonstrated remarkable performance on a variety of natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks. While both types of models have achieved promising few-shot learning performance, their potential for zero-shot learning has been underexplored. In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: A unidirectional PLM generates class-conditioned texts guided by prompts, which are used as the training data for fine-tuning a bidirectional PLM. With high-quality training data selected by generation probability, and regularization techniques (label smoothing and temporal ensembling) applied during fine-tuning for better generalization and stability, our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark (e.g., 72.3/73.8 on MNLI-m/mm and 92.8 on SST-2), significantly outperforming zero-shot prompting methods and even achieving results comparable to strong few-shot approaches that use 32 training samples per class.
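To make the pipeline concrete, below is a minimal sketch of the approach the abstract describes, written with Hugging Face transformers: a unidirectional PLM (GPT-2 here) generates class-conditioned texts from label-descriptive prompts, candidates are ranked by their average per-token generation probability, and the selected texts are used to fine-tune a bidirectional PLM with label smoothing. The prompt wording, model choices, sampling settings, and selection cutoff are illustrative assumptions rather than the paper's exact configuration; temporal ensembling is omitted for brevity.

import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          GPT2LMHeadModel, GPT2TokenizerFast)

device = "cuda" if torch.cuda.is_available() else "cpu"
gen_tok = GPT2TokenizerFast.from_pretrained("gpt2")
gen = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()

# Step 1: class-conditioned generation guided by label-descriptive prompts
# (hypothetical prompts for a binary sentiment task such as SST-2).
prompts = {0: "Negative movie review:", 1: "Positive movie review:"}

def generate_candidates(label, n=8, max_new_tokens=40):
    ids = gen_tok(prompts[label], return_tensors="pt").input_ids.to(device)
    out = gen.generate(ids, do_sample=True, top_k=40, num_return_sequences=n,
                       max_new_tokens=max_new_tokens,
                       pad_token_id=gen_tok.eos_token_id)
    # Drop the prompt tokens; keep only the generated continuation.
    return [gen_tok.decode(seq[ids.shape[1]:], skip_special_tokens=True)
            for seq in out]

# Step 2: keep the candidates the generator itself finds most probable,
# i.e. rank by the geometric mean of per-token probabilities.
@torch.no_grad()
def avg_token_prob(text):
    ids = gen_tok(text, return_tensors="pt").input_ids.to(device)
    nll = gen(ids, labels=ids).loss   # mean token-level negative log-likelihood
    return torch.exp(-nll).item()

train_data = []
for label in prompts:
    cands = generate_candidates(label)
    cands.sort(key=avg_token_prob, reverse=True)
    train_data += [(text, label) for text in cands[:4]]  # keep the top half

# Step 3: fine-tune a bidirectional PLM on the selected pseudo-labeled data;
# label smoothing is applied through the loss (temporal ensembling omitted).
clf_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
clf = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2).to(device)
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)

texts, labels = zip(*train_data)
batch = clf_tok(list(texts), padding=True, truncation=True,
                return_tensors="pt").to(device)
loss = loss_fn(clf(**batch).logits, torch.tensor(labels, device=device))
loss.backward()  # one illustrative gradient step; real training loops over epochs

The sketch only mirrors the overall data flow; the paper's experiments rely on larger generator and classifier models and more careful sampling and selection than the small defaults used here.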
Cite
Text
Meng et al. "Generating Training Data with Language Models: Towards Zero-Shot Language Understanding." Neural Information Processing Systems, 2022.

Markdown
[Meng et al. "Generating Training Data with Language Models: Towards Zero-Shot Language Understanding." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/meng2022neurips-generating/)

BibTeX
@inproceedings{meng2022neurips-generating,
  title = {{Generating Training Data with Language Models: Towards Zero-Shot Language Understanding}},
  author = {Meng, Yu and Huang, Jiaxin and Zhang, Yu and Han, Jiawei},
  booktitle = {Neural Information Processing Systems},
  year = {2022},
  url = {https://mlanthology.org/neurips/2022/meng2022neurips-generating/}
}