TOAST: Transformer Optimization Using Adaptive and Simple Transformations
Abstract
Foundation models achieve state-of-the-art performance across different tasks, but their size and computational demands raise concerns about accessibility and sustainability. Existing efficiency methods often require additional retraining or finetuning, limiting their practicality. Recent findings suggest that deep neural networks exhibit internal representation similarities. While such similarities across different models have been exploited to enable techniques such as model stitching and merging, intra-network redundancy remains underexplored as a source of efficiency gains. In this paper, we introduce Transformer Optimization Using Adaptive and Simple Transformations (TOAST), a framework that exploits these redundancies to approximate entire transformer blocks with lightweight closed-form mappings, such as linear transformations or even the identity function, without any additional training. Across state-of-the-art pretrained vision models (e.g., ViT, DINOv2, DeiT) and datasets ranging from MNIST to ImageNet-1k, TOAST reduces parameters and computation while preserving, and in some cases improving, downstream performance. These results show that large portions of transformer depth can be replaced by trivial functions, opening a new perspective on efficient foundation models.
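As a rough illustration of what a "lightweight closed-form mapping" for a block could look like, the sketch below fits an affine map from a block's input representations to its output representations with ordinary least squares and compares it against the identity function. This is only one plausible reading of the abstract and not necessarily the authors' procedure: the helper names (`fit_linear_map`, `relative_error`), the choice of least squares with a bias term, and the synthetic data standing in for real transformer activations are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_map(x_in, x_out):
    """Closed-form least-squares fit of W, b such that x_in @ W + b ~= x_out.

    Hypothetical helper: assumes x_in and x_out are (n_tokens, dim) arrays
    holding a block's input and output representations.
    """
    # Append a column of ones so the bias is fitted jointly with W.
    ones = np.ones((x_in.shape[0], 1))
    design = np.hstack([x_in, ones])
    solution, *_ = np.linalg.lstsq(design, x_out, rcond=None)
    return solution[:-1], solution[-1]


def relative_error(pred, target):
    """Frobenius-norm error of pred relative to the norm of target."""
    return np.linalg.norm(pred - target) / np.linalg.norm(target)


# Synthetic stand-in for real activations (assumption): a residual-style
# block whose output is mostly linear in its input, plus a small
# nonlinear term.
rng = np.random.default_rng(0)
n_tokens, dim = 512, 16
x_in = rng.normal(size=(n_tokens, dim))
w_true = rng.normal(size=(dim, dim)) / np.sqrt(dim)
x_out = x_in + x_in @ w_true + 0.01 * np.tanh(x_in)

# Candidate replacements for the block: a fitted affine map vs. the identity.
w, b = fit_linear_map(x_in, x_out)
err_linear = relative_error(x_in @ w + b, x_out)
err_identity = relative_error(x_in, x_out)

print(f"affine map relative error:   {err_linear:.4f}")
print(f"identity map relative error: {err_identity:.4f}")
```

Under these assumptions, a block would be a candidate for replacement when one of the simple maps reproduces its output closely enough. How TOAST actually selects blocks and which transformation it applies to each is described in the paper itself, not in this sketch.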
Cite
Text
Cannistraci et al. "TOAST: Transformer Optimization Using Adaptive and Simple Transformations." Transactions on Machine Learning Research, 2026.
Markdown
[Cannistraci et al. "TOAST: Transformer Optimization Using Adaptive and Simple Transformations." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/cannistraci2026tmlr-toast/)
BibTeX
@article{cannistraci2026tmlr-toast,
title = {{TOAST: Transformer Optimization Using Adaptive and Simple Transformations}},
author = {Cannistraci, Irene and Antonelli, Simone and Palumbo, Emanuele and Sutter, Thomas M. and Rodolà, Emanuele and Rieck, Bastian and Vogt, Julia E},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/cannistraci2026tmlr-toast/}
}