Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention

Abstract

Transformers have reached new heights in modeling natural language as a sequence of text tokens. However, in many real-world scenarios, textual data inherently exhibits structure beyond a linear sequence, such as trees and graphs, and many tasks require reasoning over evidence scattered across multiple pieces of text. This paper presents Transformer-XH, which uses eXtra Hop attention to enable intrinsic modeling of structured texts in a fully data-driven way. Its new attention mechanism naturally “hops” across connected text sequences in addition to attending over tokens within each sequence. Thus, Transformer-XH better conducts joint multi-evidence reasoning by propagating information between documents and constructing global contextualized representations. On multi-hop question answering, Transformer-XH leads to a simpler multi-hop QA system that outperforms the previous state of the art on the HotpotQA FullWiki setting. On FEVER fact verification, applying Transformer-XH provides state-of-the-art accuracy and excels on claims whose verification requires multiple pieces of evidence.
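The idea of attending within each sequence and then "hopping" across connected sequences can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the mechanism's details, so the sketch assumes that the first token of each sequence acts as a "hub", that connections are given as a hypothetical `edges` dictionary, and that the hop result is merged by a plain average. Learned projections, multiple heads, and layer stacking are all omitted, and the names `attend` and `xh_layer_sketch` are illustrative only.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys, values):
    # Scaled dot-product attention of one query over key/value vectors.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def xh_layer_sketch(sequences, edges):
    # sequences: list of sequences, each a list of token vectors.
    # edges: dict mapping a sequence index to the indices it may hop to
    #        (a hypothetical representation of the text graph).
    # 1) Ordinary attention over the tokens within each sequence.
    local = [[attend(tok, seq, seq) for tok in seq] for seq in sequences]
    # 2) Hop attention: each hub (assumed to be the first token) attends
    #    over the hubs of the sequences it is connected to.
    hubs = [seq[0] for seq in local]
    out = []
    for i, seq in enumerate(local):
        new_seq = [list(tok) for tok in seq]
        neighbours = edges.get(i, [])
        if neighbours:
            nb = [hubs[j] for j in neighbours]
            hop = attend(hubs[i], nb, nb)
            # Assumption: a plain average stands in for a learned combination.
            new_seq[0] = [0.5 * (a + b) for a, b in zip(hubs[i], hop)]
        out.append(new_seq)
    return out


docs = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.2, 0.8], [0.9, 0.1]],
    [[1.0, 1.0]],
]
edges = {0: [1], 1: [0, 2]}  # sequence 2 has no outgoing edges
updated = xh_layer_sketch(docs, edges)
print([len(seq) for seq in updated])  # prints [2, 3, 1]
```

In this sketch only the hub vector of a connected sequence changes in the hop step; the other token vectors keep their in-sequence attention output, and they would pick up cross-sequence information only if such layers were stacked.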

Cite

Text

Zhao et al. "Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention." International Conference on Learning Representations, 2020.

Markdown

[Zhao et al. "Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/zhao2020iclr-transformerxh/)

BibTeX

@inproceedings{zhao2020iclr-transformerxh,
  title     = {{Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention}},
  author    = {Zhao, Chen and Xiong, Chenyan and Rosset, Corby and Song, Xia and Bennett, Paul and Tiwary, Saurabh},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/zhao2020iclr-transformerxh/}
}