Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating
Abstract
Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and during prediction sentences are translated independently, ignoring cross-sentence contextual information. This leads to inconsistencies between translated sentences. To address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small fraction of the available parallel data, and many approaches build context-aware models on top of a pre-trained, frozen sentence-level translation model in a two-step training procedure. The computational cost of these approaches is usually high. In this paper, we propose to make the most of layers pre-trained on sentence-level data for contextual representation learning, reusing representations from the sentence-level Transformer and significantly reducing the cost of incorporating contexts into translation. We find that representations from shallow layers of a pre-trained sentence-level encoder play a vital role in source context encoding, and propose to perform source context encoding on weighted combinations of the pre-trained encoder layers' outputs. Instead of encoding the source context and the source input separately, we propose to iteratively and jointly encode the source input and its contexts and to generate input-aware context representations with a cross-attention layer and a gating mechanism that resets irrelevant information in context encoding. Our context-aware Transformer model outperforms the recent CADec [Voita et al., 2019c] on English-Russian subtitle data and is about twice as fast in training and decoding.
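A minimal PyTorch sketch of the two ideas the abstract names: a learned, softmax-normalized weighting over the outputs of a pre-trained sentence-level encoder's layers for context encoding, and an input-aware gate that cross-attends from the current input to the context and resets irrelevant context features. The module names, shapes, and all implementation details here are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn


class LayerWiseWeighting(nn.Module):
    """Sketch: combine the outputs of the frozen sentence-level encoder's
    layers with learned softmax-normalized weights (layer-wise weighting)."""

    def __init__(self, num_layers):
        super().__init__()
        # One scalar weight per pre-trained encoder layer; zeros -> uniform softmax.
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs):
        # layer_outputs: list of (batch, ctx_len, d_model) tensors, one per layer.
        stacked = torch.stack(layer_outputs, dim=0)             # (L, B, T, D)
        alpha = torch.softmax(self.weights, dim=0)              # (L,)
        return (alpha.view(-1, 1, 1, 1) * stacked).sum(dim=0)   # (B, T, D)


class InputAwareGating(nn.Module):
    """Sketch: cross-attend from the current source input to the weighted
    context representation, then gate (reset) context features that are
    irrelevant to the input (input-aware gating)."""

    def __init__(self, d_model, num_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, input_repr, context_repr):
        # input_repr:   (B, src_len, D) current-sentence encoder states
        # context_repr: (B, ctx_len, D) layer-weighted context encoder states
        attended, _ = self.cross_attn(input_repr, context_repr, context_repr)
        g = torch.sigmoid(self.gate(torch.cat([input_repr, attended], dim=-1)))
        return g * attended  # gate near 0 effectively drops irrelevant context
```

In this sketch the pre-trained encoder would stay frozen and only the layer weights, cross-attention, and gate are trained, which is one plausible reading of how reusing sentence-level representations keeps the added cost small.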
Cite
Text
Xu et al. "Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/544
Markdown
[Xu et al. "Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/xu2020ijcai-efficient/) doi:10.24963/IJCAI.2020/544
BibTeX
@inproceedings{xu2020ijcai-efficient,
title = {{Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating}},
author = {Xu, Hongfei and Xiong, Deyi and van Genabith, Josef and Liu, Qiuhui},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2020},
  pages = {3933--3940},
doi = {10.24963/IJCAI.2020/544},
url = {https://mlanthology.org/ijcai/2020/xu2020ijcai-efficient/}
}