Beyond Intuition: Rethinking Token Attributions Inside Transformers

Abstract

The multi-head attention mechanism, and Transformer-based models more broadly, have long been under the spotlight, not only for text processing but also for computer vision. Several recent works explore token attributions along the model's intrinsic decision process. However, ambiguity in their formulations can lead to an accumulation of error, which makes the interpretations less trustworthy and less applicable to different variants. In this work, we propose a novel method for approximating token contributions inside Transformers. Starting from the partial derivative with respect to each token, we divide the interpretation process into attention perception and reasoning feedback via the chain rule and explore each part individually with explicit mathematical derivations. For attention perception, we propose head-wise and token-wise approximations to learn how tokens interact to form the pooled vector. For reasoning feedback, we adopt a noise-decreasing strategy by applying integrated gradients to the last attention map. Our method is further validated qualitatively and quantitatively through faithfulness evaluations across different settings: single modality (BERT and ViT), bi-modality (CLIP), different model sizes (ViT-L), and different pooling strategies (ViT-MAE), demonstrating broad applicability and clear improvements over existing methods.
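
The reasoning-feedback step applies integrated gradients to the last attention map. As a rough illustration only (not the authors' exact formulation), the sketch below shows a generic Riemann-sum implementation of integrated gradients in PyTorch; the function name, the score_fn closure (assumed to map an attention map to a scalar target-class score), and the default number of steps are hypothetical and not taken from the paper.

import torch

def integrated_gradients(score_fn, x, baseline=None, steps=20):
    # Riemann-sum approximation of integrated gradients of a scalar score
    # along the straight path from `baseline` to the input `x`
    # (here, x would be the last attention map).
    if baseline is None:
        baseline = torch.zeros_like(x)
    accumulated = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        score = score_fn(point)                  # scalar, e.g. the target-class logit
        grad, = torch.autograd.grad(score, point)
        accumulated += grad
    # Average path gradient, scaled by the displacement from the baseline
    return (x - baseline) * accumulated / steps

In practice, score_fn would run the remainder of the network from the last attention layer onward, so the returned attribution has the same shape as the attention map and can be aggregated over heads to score individual tokens.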

Cite

Text

Chen et al. "Beyond Intuition: Rethinking Token Attributions Inside Transformers." Transactions on Machine Learning Research, 2023.

Markdown

[Chen et al. "Beyond Intuition: Rethinking Token Attributions Inside Transformers." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/chen2023tmlr-beyond/)

BibTeX

@article{chen2023tmlr-beyond,
  title     = {{Beyond Intuition: Rethinking Token Attributions Inside Transformers}},
  author    = {Chen, Jiamin and Li, Xuhong and Yu, Lei and Dou, Dejing and Xiong, Haoyi},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/chen2023tmlr-beyond/}
}