Visualizing and Measuring the Geometry of BERT

Abstract

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces. We find evidence of a fine-grained geometric representation of word senses. We also present empirical descriptions of syntactic representations in both attention matrices and individual word embeddings, as well as a mathematical argument to explain the geometry of these representations.
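As a concrete point of reference, the sketch below shows how one might extract the per-token contextual embeddings and per-head attention matrices that geometric analyses of this kind operate on. It is a minimal illustration, assuming the HuggingFace transformers and PyTorch libraries (neither is prescribed by the paper), and the example sentence is hypothetical.

from transformers import BertTokenizer, BertModel
import torch

# Load BERT-base with hidden states and attention outputs enabled.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained(
    "bert-base-uncased",
    output_hidden_states=True,
    output_attentions=True,
)
model.eval()

# Hypothetical example sentence containing an ambiguous word ("bank").
inputs = tokenizer("The boat drifted toward the river bank.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextual embeddings: a tuple of 13 tensors (embedding layer + 12
# transformer layers), each of shape (1, sequence_length, 768).
hidden_states = outputs.hidden_states

# Attention matrices: a tuple of 12 tensors, one per layer, each of shape
# (1, 12 heads, sequence_length, sequence_length).
attentions = outputs.attentions

In this spirit, word-sense structure can be examined by collecting the token vectors for a given word across many sentences and projecting them (e.g., with PCA or UMAP), while syntactic structure can be probed from the attention matrices or the embeddings themselves, along the lines of the analyses the abstract describes.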

Cite

Text

Reif et al. "Visualizing and Measuring the Geometry of BERT." Neural Information Processing Systems, 2019.

Markdown

[Reif et al. "Visualizing and Measuring the Geometry of BERT." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/reif2019neurips-visualizing/)

BibTeX

@inproceedings{reif2019neurips-visualizing,
  title     = {{Visualizing and Measuring the Geometry of BERT}},
  author    = {Reif, Emily and Yuan, Ann and Wattenberg, Martin and Viegas, Fernanda B. and Coenen, Andy and Pearce, Adam and Kim, Been},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {8594--8603},
  url       = {https://mlanthology.org/neurips/2019/reif2019neurips-visualizing/}
}