Looped Transformers as Programmable Computers

Abstract

We present a framework for using transformer networks as universal computers by programming them with specific weights and placing them in a loop. Our input sequence acts as a punchcard, consisting of instructions and memory for data reads/writes. We demonstrate that a constant number of encoder layers can emulate basic computing blocks, including lexicographic operations, non-linear functions, function calls, program counters, and conditional branches. Using this framework, we emulate a computer with a simple instruction-set architecture, which allows us to map iterative algorithms to programs executed by a constant-depth looped transformer network. We show how a single frozen transformer, instructed by its input, can emulate a basic calculator, a basic linear algebra library, and even a full backpropagation-based in-context learning algorithm. Our findings reveal the potential of transformer networks as programmable compute units and offer insight into the mechanics of attention.
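To make the execution model in the abstract concrete, here is a minimal sketch of the looped-transformer idea: a single frozen encoder layer is applied repeatedly to an input sequence that serves as both program ("punchcard") and read/write memory. This is not the paper's construction; the class and function names (`FrozenBlock`, `run_looped`) and all hyperparameters are illustrative assumptions built on standard PyTorch modules.

```python
import torch
import torch.nn as nn


class FrozenBlock(nn.Module):
    """One encoder layer with fixed (non-trained) weights.

    In the paper's framework the weights are *programmed*, not learned;
    here we just freeze random weights to illustrate the control flow.
    """

    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        for p in self.parameters():  # frozen: the weights act as the "CPU"
            p.requires_grad_(False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer(x)


def run_looped(block: nn.Module, punchcard: torch.Tensor, n_loops: int) -> torch.Tensor:
    """Feed the block's output back as its input n_loops times.

    `punchcard` holds instructions and scratchpad memory side by side;
    each pass through the loop emulates one step of the programmed
    computation, so iterative algorithms run on a constant-depth network.
    """
    x = punchcard
    for _ in range(n_loops):
        x = block(x)
    return x


if __name__ == "__main__":
    d_model, seq_len = 64, 32
    block = FrozenBlock(d_model)
    # Random stand-in for an encoded program plus memory region.
    punchcard = torch.randn(1, seq_len, d_model)
    out = run_looped(block, punchcard, n_loops=10)
    print(out.shape)  # torch.Size([1, 32, 64])
```

The design point the sketch captures is that the network itself never changes between steps: all program-specific behavior lives in the input sequence, and looping supplies the unbounded iteration that a fixed-depth feedforward pass cannot.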

Cite

Text

Giannou et al. "Looped Transformers as Programmable Computers." ICLR 2023 Workshops: ME-FoMo, 2023.

Markdown

[Giannou et al. "Looped Transformers as Programmable Computers." ICLR 2023 Workshops: ME-FoMo, 2023.](https://mlanthology.org/iclrw/2023/giannou2023iclrw-looped/)

BibTeX

@inproceedings{giannou2023iclrw-looped,
  title     = {{Looped Transformers as Programmable Computers}},
  author    = {Giannou, Angeliki and Rajput, Shashank and Sohn, Jy-yong and Lee, Kangwook and Lee, Jason D. and Papailiopoulos, Dimitris},
  booktitle = {ICLR 2023 Workshops: ME-FoMo},
  year      = {2023},
  url       = {https://mlanthology.org/iclrw/2023/giannou2023iclrw-looped/}
}