Jump Self-Attention: Capturing High-Order Statistics in Transformers

Abstract

The recent success of the Transformer has benefited many real-world applications, thanks to its ability to build long-range dependencies through pairwise dot products. However, the strong assumption that elements attend directly to each other limits performance on tasks with high-order dependencies, such as natural language understanding and image captioning. To address this, we are the first to define Jump Self-Attention (JAT) for building Transformers. Inspired by the piece movements of English Draughts, we introduce a spectral convolutional technique to compute JAT on the dot-product feature map. This technique allows JAT to propagate within each self-attention head and is interchangeable with canonical self-attention. We further develop higher-order variants under the multi-hop assumption to increase generality. Moreover, the proposed architecture is compatible with pre-trained models. With extensive experiments, we empirically show that our methods significantly improve performance on ten different tasks.
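The abstract does not spell out the exact spectral formulation, but the multi-hop idea can be illustrated with a minimal sketch: compute the canonical dot-product attention map, then propagate it one extra hop (attention through an intermediate token) and mix the two maps before applying them to the values. The class name, the learnable gate, and the plain two-hop product A·A below are illustrative assumptions, not the authors' exact JAT construction.

import torch
import torch.nn as nn


class TwoHopSelfAttention(nn.Module):
    """Illustrative sketch: canonical one-hop attention mixed with a
    two-hop ("jump") propagation of the attention map.
    Not the paper's exact spectral-convolution formulation."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Learnable gate between one-hop and two-hop maps (assumption).
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, L, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(t: torch.Tensor) -> torch.Tensor:
            # (B, L, d_model) -> (B, n_heads, L, d_head)
            return t.view(B, L, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = map(split, (q, k, v))

        # Canonical one-hop attention map per head.
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        # Two-hop map: attend to a token's attended tokens (the "jump").
        attn2 = attn @ attn
        mixed = self.alpha * attn + (1 - self.alpha) * attn2

        out = (mixed @ v).transpose(1, 2).reshape(B, L, -1)
        return self.out(out)


if __name__ == "__main__":
    layer = TwoHopSelfAttention(d_model=64, n_heads=8)
    x = torch.randn(2, 16, 64)
    print(layer(x).shape)  # torch.Size([2, 16, 64])

Because the mixed map has the same shape as the canonical attention map, a module like this is drop-in interchangeable with standard self-attention, which is the compatibility property the abstract highlights.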

Cite

Text

Zhou et al. "Jump Self-Attention: Capturing High-Order Statistics in Transformers." Neural Information Processing Systems, 2022.

Markdown

[Zhou et al. "Jump Self-Attention: Capturing High-Order Statistics in Transformers." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/zhou2022neurips-jump/)

BibTeX

@inproceedings{zhou2022neurips-jump,
  title     = {{Jump Self-Attention: Capturing High-Order Statistics in Transformers}},
  author    = {Zhou, Haoyi and Xiao, Siyang and Zhang, Shanghang and Peng, Jieqi and Zhang, Shuai and Li, Jianxin},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/zhou2022neurips-jump/}
}