Is Transformer a Stochastic Parrot? a Case Study in Simple Arithmetic Task
Abstract
Large pretrained language models have demonstrated impressive capabilities, but much remains unknown about how they operate internally. In this study, we conduct a multifaceted investigation of the autoregressive transformer's ability to perform basic addition. Specifically, we use causal tracing to locate the information flow between the attention and fully-connected layers. We find that attention layers exploit fixed patterns in the intermediate layers to transfer carry and digit information. These patterns project the input onto a small set of neurons in the later fully-connected layers, whose activations select a vocabulary distribution stored in the parameter space to implement the input-to-output mapping. Our analysis can further be extended to the interpretability of general classification tasks such as sentiment analysis. The findings suggest that, although the model appears to have learned some arithmetic rules, most of its reasoning still relies on statistical patterns.
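The causal tracing referred to in the abstract is, in general form, an activation-patching procedure: run the model on a clean prompt and a minimally corrupted one, then restore individual clean activations during the corrupted run to see which layers and token positions causally carry the answer. The sketch below illustrates this general technique on a GPT-2-style model from Hugging Face Transformers; the prompts, the target token, and the choice of model are assumptions for illustration, not the paper's exact setup.

```python
# Minimal sketch of causal tracing via activation patching (illustrative only;
# the paper's exact prompts, model, and metrics are not reproduced here).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

clean_prompt = "23+58="    # hypothetical clean addition prompt (answer 81)
corrupt_prompt = "23+53="  # hypothetical corrupted prompt (answer changes)
# Assumes both prompts tokenize to the same number of tokens.
clean_ids = tok(clean_prompt, return_tensors="pt").input_ids
corrupt_ids = tok(corrupt_prompt, return_tensors="pt").input_ids

# 1) Run the clean prompt and cache the hidden states after each block.
clean_cache = {}
def make_cache_hook(layer_idx):
    def hook(module, inputs, output):
        clean_cache[layer_idx] = output[0].detach()  # block output hidden states
    return hook

handles = [blk.register_forward_hook(make_cache_hook(i))
           for i, blk in enumerate(model.transformer.h)]
with torch.no_grad():
    model(clean_ids)
for h in handles:
    h.remove()

# First token of the clean answer; " 81" is an assumed target for illustration.
answer_id = tok(" 81", add_special_tokens=False).input_ids[0]

# 2) Re-run the corrupted prompt while patching in one clean activation at a
#    time, and measure how much probability of the clean answer is restored.
def patch_and_score(layer_idx, position):
    def hook(module, inputs, output):
        patched = output[0].clone()
        patched[:, position, :] = clean_cache[layer_idx][:, position, :]
        return (patched,) + output[1:]
    handle = model.transformer.h[layer_idx].register_forward_hook(hook)
    with torch.no_grad():
        logits = model(corrupt_ids).logits[0, -1]
    handle.remove()
    return torch.softmax(logits, dim=-1)[answer_id].item()

# High restoration scores flag (layer, position) sites that causally carry
# the numeric/carry information the abstract describes.
for layer in range(len(model.transformer.h)):
    scores = [patch_and_score(layer, pos) for pos in range(clean_ids.shape[1])]
    print(layer, [round(s, 3) for s in scores])
```

In the same spirit, the hooks can be attached to the attention (`attn`) and MLP (`mlp`) submodules separately to distinguish information routed by attention from mappings implemented by the fully-connected layers.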
Cite
Text
Wang et al. "Is Transformer a Stochastic Parrot? a Case Study in Simple Arithmetic Task." ICML 2024 Workshops: MI, 2024.
Markdown
[Wang et al. "Is Transformer a Stochastic Parrot? a Case Study in Simple Arithmetic Task." ICML 2024 Workshops: MI, 2024.](https://mlanthology.org/icmlw/2024/wang2024icmlw-transformer/)
BibTeX
@inproceedings{wang2024icmlw-transformer,
title = {{Is Transformer a Stochastic Parrot? a Case Study in Simple Arithmetic Task}},
author = {Wang, Peixu and Yu, Chen and Ming, Yu},
booktitle = {ICML 2024 Workshops: MI},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/wang2024icmlw-transformer/}
}