Permutation Equivariance of Transformers and Its Applications

Abstract

Revolutionizing the field of deep learning, Transformer-based models have achieved remarkable performance on many tasks. Recent research has recognized that these models are robust to shuffling, but the analysis is limited to inter-token permutation in the forward propagation. In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. We rigorously prove that this permutation equivariance property is satisfied by most vanilla Transformer-based models with almost no adaptation. We examine the property over a range of state-of-the-art models, including ViT, BERT, GPT, and others, with experimental validation. Further, as a proof of concept, we explore how real-world applications, including privacy-enhancing split learning and model authorization, could exploit the permutation equivariance property, which suggests wider intriguing application scenarios. The code is available at https://github.com/Doby-Xu/ST
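The simpler inter-token half of this property can be illustrated with a minimal PyTorch sketch (not the authors' released code; the layer sizes and the use of a single nn.MultiheadAttention layer without positional encoding are illustrative assumptions): permuting the input tokens of a self-attention layer permutes its output tokens in exactly the same way.

import torch
import torch.nn as nn

# Minimal sketch: inter-token permutation equivariance of self-attention
# (no positional encoding). Shapes and seed are arbitrary choices.
torch.manual_seed(0)
d_model, n_heads, seq_len = 64, 4, 10
attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
attn.eval()

x = torch.randn(1, seq_len, d_model)   # (batch, tokens, dim)
perm = torch.randperm(seq_len)         # random inter-token permutation

with torch.no_grad():
    y, _ = attn(x, x, x)                                    # original order
    y_perm, _ = attn(x[:, perm], x[:, perm], x[:, perm])    # permuted input

# Equivariance: permuting the input tokens permutes the output tokens identically.
print(torch.allclose(y[:, perm], y_perm, atol=1e-6))  # expected: True

The intra-token case studied in the paper additionally involves permutations along the feature dimension together with the model weights (in both forward and backward propagation), which this sketch does not cover; see the authors' repository for the full construction.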

Cite

Text

Xu et al. "Permutation Equivariance of Transformers and Its Applications." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00572

Markdown

[Xu et al. "Permutation Equivariance of Transformers and Its Applications." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/xu2024cvpr-permutation/) doi:10.1109/CVPR52733.2024.00572

BibTeX

@inproceedings{xu2024cvpr-permutation,
  title     = {{Permutation Equivariance of Transformers and Its Applications}},
  author    = {Xu, Hengyuan and Xiang, Liyao and Ye, Hangyu and Yao, Dixi and Chu, Pengzhi and Li, Baochun},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {5987--5996},
  doi       = {10.1109/CVPR52733.2024.00572},
  url       = {https://mlanthology.org/cvpr/2024/xu2024cvpr-permutation/}
}