Multitask Prompted Training Enables Zero-Shot Task Generalization

Abstract

Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models’ pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language task into a human-readable prompted form. We use it to convert a large set of supervised datasets, each with multiple prompts using diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several datasets, often outperforming models up to 16× its size. Further, our model attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models up to 6× its size. All trained models are available at https://github.com/bigscience-workshop/t-zero, and all prompts are available at https://github.com/bigscience-workshop/promptsource.
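
The abstract describes converting supervised datasets into prompted form and then evaluating zero-shot on held-out tasks. The snippet below is a minimal sketch of that workflow, assuming the promptsource library's DatasetTemplates API and the publicly released bigscience/T0_3B checkpoint on the Hugging Face Hub; the dataset and template choices are illustrative, not prescribed by the paper.

# Sketch: apply a promptsource template to a held-out task example and run
# zero-shot inference with a released T0 checkpoint. Assumes the datasets,
# promptsource, and transformers libraries are installed; the RTE dataset and
# the first available template are illustrative choices.
from datasets import load_dataset
from promptsource.templates import DatasetTemplates
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load one example from a held-out task (RTE, natural language inference).
example = load_dataset("super_glue", "rte", split="validation")[0]

# Fetch the prompt templates written for this dataset and pick one by name.
rte_templates = DatasetTemplates("super_glue", "rte")
template = rte_templates[rte_templates.all_template_names[0]]
# apply() fills the template with the example's fields; for templates with a
# target it returns the prompted input and the gold answer as text.
prompt_text, target = template.apply(example)

# Run the prompted input through a released checkpoint (the 3B variant keeps
# memory needs modest; the paper's main model is larger).
tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")

inputs = tokenizer(prompt_text, return_tensors="pt")
outputs = model.generate(**inputs)
print("Prompted input:", prompt_text)
print("Model answer:  ", tokenizer.decode(outputs[0], skip_special_tokens=True))
print("Gold target:   ", target)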

Cite

Text

Sanh et al. "Multitask Prompted Training Enables Zero-Shot Task Generalization." International Conference on Learning Representations, 2022.

Markdown

[Sanh et al. "Multitask Prompted Training Enables Zero-Shot Task Generalization." International Conference on Learning Representations, 2022.](https://mlanthology.org/iclr/2022/sanh2022iclr-multitask/)

BibTeX

@inproceedings{sanh2022iclr-multitask,
  title     = {{Multitask Prompted Training Enables Zero-Shot Task Generalization}},
  author    = {Sanh, Victor and Webson, Albert and Raffel, Colin and Bach, Stephen and Sutawika, Lintang and Alyafeai, Zaid and Chaffin, Antoine and Stiegler, Arnaud and Raja, Arun and Dey, Manan and Bari, M Saiful and Xu, Canwen and Thakker, Urmish and Sharma, Shanya Sharma and Szczechla, Eliza and Kim, Taewoon and Chhablani, Gunjan and Nayak, Nihal and Datta, Debajyoti and Chang, Jonathan and Jiang, Mike Tian-Jian and Wang, Han and Manica, Matteo and Shen, Sheng and Yong, Zheng Xin and Pandey, Harshit and Bawden, Rachel and Wang, Thomas and Neeraj, Trishala and Rozen, Jos and Sharma, Abheesht and Santilli, Andrea and Fevry, Thibault and Fries, Jason Alan and Teehan, Ryan and Le Scao, Teven and Biderman, Stella and Gao, Leo and Wolf, Thomas and Rush, Alexander M},
  booktitle = {International Conference on Learning Representations},
  year      = {2022},
  url       = {https://mlanthology.org/iclr/2022/sanh2022iclr-multitask/}
}