Plex: Towards Reliability Using Pretrained Large Model Extensions
Abstract
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs consistently well over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 38 datasets to evaluate different aspects of reliability in both vision and language domains. To improve reliability, we develop ViT-Plex and T5-Plex, pretrained large model extensions (plex) for vision and language modalities, respectively. Plex greatly improves the state-of-the-art across reliability tasks and simplifies the traditional protocol, as it does not require designing scores or tuning the model for each individual task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex's capabilities on challenging tasks including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.
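To make one of the uncertainty tasks named in the abstract concrete, below is a minimal NumPy sketch of selective prediction: the model abstains on its least-confident inputs and is scored only on the examples it keeps. The function and variable names are illustrative and not taken from the Plex codebase.

```python
import numpy as np

def selective_accuracy(probs, labels, coverage=0.8):
    """Accuracy on the `coverage` fraction of examples with highest confidence."""
    confidence = probs.max(axis=-1)                     # max predictive probability per example
    keep = np.argsort(-confidence)[: int(coverage * len(labels))]  # most confident subset
    preds = probs.argmax(axis=-1)
    return (preds[keep] == labels[keep]).mean()

# Example: 4 examples, 3 classes; scoring only the 2 most confident predictions.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],
                  [0.20, 0.70, 0.10],
                  [0.34, 0.33, 0.33]])
labels = np.array([0, 2, 1, 0])
print(selective_accuracy(probs, labels, coverage=0.5))  # 1.0 on the retained half
```

Sweeping `coverage` from 1.0 down to 0 traces the usual risk–coverage trade-off: a model with well-calibrated uncertainty should see its error drop as it is allowed to abstain more often.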
Cite
Text
Tran et al. "Plex: Towards Reliability Using Pretrained Large Model Extensions." ICML 2022 Workshops: Pre-Training, 2022.
Markdown
[Tran et al. "Plex: Towards Reliability Using Pretrained Large Model Extensions." ICML 2022 Workshops: Pre-Training, 2022.](https://mlanthology.org/icmlw/2022/tran2022icmlw-plex/)
BibTeX
@inproceedings{tran2022icmlw-plex,
title = {{Plex: Towards Reliability Using Pretrained Large Model Extensions}},
author = {Tran, Dustin and Liu, Jeremiah Zhe and Dusenberry, Michael W and Phan, Du and Collier, Mark and Ren, Jie and Han, Kehang and Wang, Zi and Mariet, Zelda E and Hu, Huiyi and Band, Neil and Rudner, Tim G. J. and Singhal, Karan and Nado, Zachary and van Amersfoort, Joost and Kirsch, Andreas and Jenatton, Rodolphe and Thain, Nithum and Yuan, Honglin and Buchanan, E. Kelly and Murphy, Kevin Patrick and Sculley, D. and Gal, Yarin and Ghahramani, Zoubin and Snoek, Jasper and Lakshminarayanan, Balaji},
booktitle = {ICML 2022 Workshops: Pre-Training},
year = {2022},
url = {https://mlanthology.org/icmlw/2022/tran2022icmlw-plex/}
}