Perceiver IO: A General Architecture for Structured Inputs & Outputs

Abstract

A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible. Current architectures, however, cannot be applied beyond a small set of stereotyped settings, as they bake in domain & task assumptions or scale poorly to large inputs or outputs. In this work, we propose Perceiver IO, a general-purpose architecture that handles data from arbitrary settings while scaling linearly with the size of inputs and outputs. Our model augments the Perceiver with a flexible querying mechanism that enables outputs of various sizes and semantics, doing away with the need for task-specific architecture engineering. The same architecture achieves strong results on tasks spanning natural language and visual understanding, multi-task and multi-modal reasoning, and StarCraft II. As highlights, Perceiver IO outperforms a Transformer-based BERT baseline on the GLUE language benchmark despite removing input tokenization and achieves state-of-the-art performance on Sintel optical flow estimation with no explicit mechanisms for multiscale correspondence.
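The abstract's key claims — linear scaling in input/output size and a querying mechanism that decouples output shape from input shape — follow from the encode/process/decode structure of the architecture. Below is a minimal single-head sketch (not the authors' implementation; names like `perceiver_io` and the fixed latent/query shapes are illustrative assumptions) showing why the cost is linear: the latent array has a fixed small size, so cross-attention touches each input and each output element only once.

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def perceiver_io(inputs, latents, output_queries, n_latent_blocks=2):
    # Encode: a small, fixed-size latent array cross-attends to the
    # (arbitrarily large) input array — O(M * N) with M latents fixed.
    z = latents + attention(latents, inputs, inputs)
    # Process: self-attention entirely in latent space, O(M^2) per
    # block, independent of input size.
    for _ in range(n_latent_blocks):
        z = z + attention(z, z, z)
    # Decode: one query per desired output element, so the output's
    # size and semantics are set by the query array alone — O(O * M).
    return attention(output_queries, z, z)
```

In this sketch, changing the task only means changing `output_queries` (e.g. one query per pixel for optical flow, one per token position for language), which is the "flexible querying mechanism" the abstract describes.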

Cite

Text

Jaegle et al. "Perceiver IO: A General Architecture for Structured Inputs & Outputs." International Conference on Learning Representations, 2022.

Markdown

[Jaegle et al. "Perceiver IO: A General Architecture for Structured Inputs & Outputs." International Conference on Learning Representations, 2022.](https://mlanthology.org/iclr/2022/jaegle2022iclr-perceiver/)

BibTeX

@inproceedings{jaegle2022iclr-perceiver,
  title     = {{Perceiver IO: A General Architecture for Structured Inputs \& Outputs}},
  author    = {Jaegle, Andrew and Borgeaud, Sebastian and Alayrac, Jean-Baptiste and Doersch, Carl and Ionescu, Catalin and Ding, David and Koppula, Skanda and Zoran, Daniel and Brock, Andrew and Shelhamer, Evan and H{\'e}naff, Olivier J. and Botvinick, Matthew and Zisserman, Andrew and Vinyals, Oriol and Carreira, Joao},
  booktitle = {International Conference on Learning Representations},
  year      = {2022},
  url       = {https://mlanthology.org/iclr/2022/jaegle2022iclr-perceiver/}
}