The Transformer Cookbook

Abstract

We present the transformer cookbook: a collection of techniques for directly encoding algorithms into a transformer's parameters. This work addresses the steep learning curve of such endeavors, a problem exacerbated by a fragmented literature in which key results are scattered across numerous papers. In particular, we synthesize this disparate body of findings into a curated set of recipes that demonstrate how to implement everything from basic arithmetic in feed-forward layers to complex data routing via self-attention. Our mise en place of formulations serves both newcomers seeking an accessible entry point and experts in need of a systematic reference. This unified presentation of transformer constructions provides a foundation for future work, ranging from theoretical research in computational complexity to empirical investigations in architecture design and interpretability. We provide code implementations of each construction in NumPy, alongside a suite of generative unit tests.
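As a taste of what such recipes look like, the sketch below implements two toy constructions in NumPy: a one-hidden-layer ReLU feed-forward block that writes max(x, y) into a spare residual-stream coordinate, using the identity max(x, y) = relu(x - y) + relu(y) - relu(-y), and a single attention head whose saturated softmax routes the payload at position 0 to every position. The residual-stream layout, the function names ffn_max and attend_to_first, and the scale constant are illustrative assumptions made for this sketch, not the paper's actual code.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ffn_max(resid):
    # Feed-forward recipe (illustrative): write max(x, y) into dim 2,
    # assuming the layout dim 0 = x, dim 1 = y, dim 2 = empty output slot.
    # Uses max(x, y) = relu(x - y) + relu(y) - relu(-y).
    W1 = np.array([[ 1.0,  0.0,  0.0],   # x feeds hidden unit 0
                   [-1.0,  1.0, -1.0],   # y feeds hidden units 0, 1, 2
                   [ 0.0,  0.0,  0.0]])  # output slot feeds nothing
    W2 = np.array([[0.0, 0.0,  1.0],     # hidden 0 = relu(x - y) -> +out
                   [0.0, 0.0,  1.0],     # hidden 1 = relu(y)     -> +out
                   [0.0, 0.0, -1.0]])    # hidden 2 = relu(-y)    -> -out
    return resid + relu(resid @ W1) @ W2  # residual connection

def attend_to_first(resid, scale=20.0):
    # Attention recipe (illustrative): copy position 0's payload (dim 0)
    # into dim 2 at every position. Dim 1 holds a flag that is 1.0 only
    # at position 0; constant queries against scaled flag keys make the
    # softmax nearly one-hot on position 0.
    n, d = resid.shape
    q = np.ones((n, 1))                   # constant query at every position
    k = scale * resid[:, 1:2]             # key = scaled first-position flag
    Wv = np.zeros((d, d))
    Wv[0, 2] = 1.0                        # value: move payload dim 0 -> dim 2
    return resid + softmax(q @ k.T) @ (resid @ Wv)

resid = np.zeros((4, 3))
resid[:, 0] = [7.0, -2.0, 5.0, 1.0]       # payloads
resid[0, 1] = 1.0                         # flag marking position 0
print(attend_to_first(resid)[:, 2])       # ~[7. 7. 7. 7.]
print(ffn_max(np.array([[3.0, 5.0, 0.0]]))[:, 2])  # [5.]

The large key scale here is a standard device in constructions of this kind: as the scale grows, the softmax weights concentrate on the flagged position, so the head approximates an exact, hard-attention routing operation.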

Cite

Text

Yang et al. "The Transformer Cookbook." Transactions on Machine Learning Research, 2026.

Markdown

[Yang et al. "The Transformer Cookbook." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/yang2026tmlr-transformer/)

BibTeX

@article{yang2026tmlr-transformer,
  title     = {{The Transformer Cookbook}},
  author    = {Yang, Andy and Watson, Christopher and Xue, Anton and Bhattamishra, Satwik and Llarena, Jose and Merrill, William and Ferreira, Emile Dos Santos and Svete, Anej and Chiang, David},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/yang2026tmlr-transformer/}
}