The Transformer Cookbook
Abstract
We present the transformer cookbook: a collection of techniques for directly encoding algorithms into a transformer's parameters. This work addresses the steep learning curve of such endeavors, a problem exacerbated by a fragmented literature where key results are scattered across numerous papers. In particular, we synthesize this disparate body of findings into a curated set of recipes that demonstrate how to implement everything from basic arithmetic in feed-forward layers to complex data routing via self-attention. Our mise en place of formulations serves both newcomers seeking an accessible entry point and experts in need of a systematic reference. This unified presentation of transformer constructions provides a foundation for future work, ranging from theoretical research in computational complexity to empirical investigations in architecture design and interpretability. We provide code implementations of each construction in numpy alongside a suite of generative unit tests.
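To make the two recipe families named in the abstract concrete, here is a minimal numpy sketch in their spirit: arithmetic in a feed-forward layer and data routing via self-attention, with all weights hand-set rather than trained. The identities used (min(x, y) = y − ReLU(y − x), and hard attention approximated by a sharply scaled softmax over one-hot positional encodings) are standard, but the function names, dimensions, and exact weight layout are illustrative assumptions, not the paper's own constructions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ffn_min(x, y):
    # Arithmetic in a feed-forward layer: one ReLU layer computing
    # min(x, y) = y - relu(y - x). Weights are hand-set, not trained.
    h = np.array([x, y])
    W1 = np.array([[-1.0, 1.0]])          # hidden pre-activation: y - x
    hidden = np.maximum(W1 @ h, 0.0)      # relu(y - x)
    W2 = np.array([[-1.0]])               # subtract relu(y - x) back out
    return y + (W2 @ hidden)[0]           # = min(x, y)

def copy_previous(tokens, scale=50.0):
    # Data routing via self-attention: one hand-set head that copies each
    # position's value from position i-1. A large scale makes the softmax
    # approximately hard. (Illustrative name; not the paper's code.)
    n, _ = tokens.shape
    pos = np.eye(n)                       # one-hot positional encodings
    Q = np.roll(pos, -1, axis=1) * scale  # query at i: scaled one-hot of i-1
    K = pos                               # key at j: one-hot of j
    scores = Q @ K.T                      # scale iff j == i-1 (mod n), else 0
    return softmax(scores) @ tokens       # row i ~ tokens[i-1]; row 0 wraps

# Sanity checks in the spirit of the paper's generative unit tests.
assert np.isclose(ffn_min(3.0, 5.0), 3.0)
assert np.isclose(ffn_min(5.0, 3.0), 3.0)
toks = np.eye(4)                          # four distinct one-hot "tokens"
out = copy_previous(toks)
assert np.allclose(out[1:], toks[:-1], atol=1e-3)
```

The feed-forward recipe exploits that ReLU lets a linear layer express piecewise-linear functions exactly; the attention recipe exploits that one-hot queries and keys turn the score matrix into a selection pattern, which a sharply scaled softmax converts into near-hard routing.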
Cite
Text
Yang et al. "The Transformer Cookbook." Transactions on Machine Learning Research, 2026.
Markdown
[Yang et al. "The Transformer Cookbook." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/yang2026tmlr-transformer/)
BibTeX
@article{yang2026tmlr-transformer,
title = {{The Transformer Cookbook}},
author = {Yang, Andy and Watson, Christopher and Xue, Anton and Bhattamishra, Satwik and Llarena, Jose and Merrill, William and Ferreira, Emile Dos Santos and Svete, Anej and Chiang, David},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/yang2026tmlr-transformer/}
}