Novel Positional Encodings to Enable Tree-Based Transformers

Abstract

Neural models optimized for tree-based problems are of great value in tasks like SQL query extraction and program synthesis. On sequence-structured data, transformers have been shown to learn relationships across arbitrary pairs of positions more reliably than recurrent models. Motivated by this property, we propose a method to extend transformers to tree-structured data, enabling sequence-to-tree, tree-to-sequence, and tree-to-tree mappings. Our approach abstracts the transformer's sinusoidal positional encodings, allowing us to instead use a novel positional encoding scheme to represent node positions within trees. We evaluated our model in tree-to-tree program translation and sequence-to-tree semantic parsing settings, achieving superior performance over both sequence-to-sequence transformers and state-of-the-art tree-based LSTMs on several datasets. In particular, our results include a 22% absolute increase in accuracy on a JavaScript to CoffeeScript translation dataset.
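To make the core idea concrete: the paper replaces the transformer's sinusoidal positional encodings with encodings derived from a node's path from the root of the tree. The sketch below is only a rough illustration of that idea, not the paper's exact formulation (which additionally handles weighting and normalization of the path components); the function name `tree_positional_encoding` and the parameters `max_depth` and `branching_factor` are hypothetical choices for this example.

import numpy as np

def tree_positional_encoding(path, max_depth=8, branching_factor=2):
    """Encode a tree node's position as stacked one-hot child choices.

    `path` lists the child indices taken from the root to the node,
    e.g. [] for the root, or [0, 1] for the right child of the root's
    left child in a binary tree. Levels below the node are zero-padded
    so every node receives a fixed-size encoding.
    """
    if len(path) > max_depth:
        raise ValueError("node is deeper than max_depth")
    encoding = np.zeros((max_depth, branching_factor))
    for depth, child_index in enumerate(path):
        encoding[depth, child_index] = 1.0
    # Flatten to a single vector of length max_depth * branching_factor.
    return encoding.reshape(-1)

# Example: distinct positions in a small binary tree.
root = tree_positional_encoding([])
left_child = tree_positional_encoding([0])
left_right_grandchild = tree_positional_encoding([0, 1])

Because each node's vector is determined solely by its root-to-node path, the same scheme applies whether the tree appears on the encoder side, the decoder side, or both, which is what enables the sequence-to-tree, tree-to-sequence, and tree-to-tree settings described in the abstract.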

Cite

Text

Shiv and Quirk. "Novel Positional Encodings to Enable Tree-Based Transformers." Neural Information Processing Systems, 2019.

Markdown

[Shiv and Quirk. "Novel Positional Encodings to Enable Tree-Based Transformers." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/shiv2019neurips-novel/)

BibTeX

@inproceedings{shiv2019neurips-novel,
  title     = {{Novel Positional Encodings to Enable Tree-Based Transformers}},
  author    = {Shiv, Vighnesh and Quirk, Chris},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {12081--12091},
  url       = {https://mlanthology.org/neurips/2019/shiv2019neurips-novel/}
}