A Logic for Expressing Log-Precision Transformers
Abstract
One way to interpret the reasoning power of transformer-based language models is to describe the types of logical rules they can resolve over some input text. Recently, Chiang et al. (2023) showed that finite-precision transformer classifiers can be equivalently expressed in a generalization of first-order logic. However, finite-precision transformers are a weak transformer variant because, as we show, a single head can only attend to a constant number of tokens and, in particular, cannot represent uniform attention. Since attending broadly is a core capability for transformers, we ask whether a minimally more expressive model that can attend universally can also be characterized in logic. To this end, we analyze transformers whose forward pass is computed in $\log n$ precision on contexts of length $n$. We prove that any log-precision transformer classifier can be equivalently expressed as a first-order logic sentence that, in addition to standard universal and existential quantifiers, may also contain majority-vote quantifiers. This is the tightest known upper bound and first logical characterization of log-precision transformers.
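The abstract's claim that a finite-precision head cannot represent uniform attention can be illustrated numerically. The sketch below is not from the paper: it assumes a simple round-to-nearest fixed-point model with a hypothetical `quantize` helper, purely to convey the intuition that with a fixed number of fractional bits $p$, the uniform weight $1/n$ underflows to zero once $n$ grows past roughly $2^{p}$, whereas $\log n$ precision lets $p$ grow with the context length.

```python
import numpy as np

# Minimal sketch (assumption: a simple round-to-nearest fixed-point model,
# not the paper's formal datatype). With p fractional bits, any attention
# weight below 2**-(p+1) rounds to zero, so uniform weights 1/n all vanish
# once n exceeds 2**(p+1) -- the head can no longer attend uniformly.

def quantize(x, frac_bits):
    """Round each value to the nearest multiple of 2**-frac_bits."""
    scale = 2 ** frac_bits
    return np.round(x * scale) / scale

for p in (4, 8):                          # hypothetical fixed precisions
    for n in (8, 2 ** p, 2 ** (p + 2)):   # growing context lengths
        uniform = np.full(n, 1.0 / n)     # ideal uniform attention weights
        q = quantize(uniform, p)
        print(f"p={p} bits, n={n}: nonzero quantized weights = {np.count_nonzero(q)}")

# Under log precision, p may grow as Theta(log n), so 1/n stays representable
# and uniform attention remains expressible at every context length.
```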
Cite
Text
Merrill and Sabharwal. "A Logic for Expressing Log-Precision Transformers." Neural Information Processing Systems, 2023.
Markdown
[Merrill and Sabharwal. "A Logic for Expressing Log-Precision Transformers." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/merrill2023neurips-logic/)
BibTeX
@inproceedings{merrill2023neurips-logic,
title = {{A Logic for Expressing Log-Precision Transformers}},
author = {Merrill, William and Sabharwal, Ashish},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/merrill2023neurips-logic/}
}