Multilingual Neural Machine Translation with Soft Decoupled Encoding

Abstract

Multilingual training of neural machine translation (NMT) systems has led to impressive accuracy improvements on low-resource languages. However, there are still significant challenges in efficiently learning word representations in the face of a paucity of data. In this paper, we propose Soft Decoupled Encoding (SDE), a multilingual lexicon encoding framework specifically designed to share lexical-level information intelligently without requiring heuristic preprocessing such as pre-segmenting the data. SDE represents a word by its spelling through a character encoding, and by its semantic meaning through a latent embedding space shared by all languages. Experiments on a standard dataset of four low-resource languages show consistent improvements over strong multilingual NMT baselines, with gains of up to 2 BLEU on one of the tested languages, achieving a new state of the art on all four language pairs.
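The abstract's core idea, a lexical (spelling) view combined with a language-agnostic semantic view, can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the embedding dimensions, the character trigram encoding, the hash-based n-gram lookup, and the residual combination of the two views are all assumptions made here for brevity.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)

D = 16        # embedding dimension (assumed for illustration)
N_LATENT = 8  # size of the shared latent semantic space (assumed)

# Latent semantic embeddings, shared across all languages.
latent = rng.normal(size=(N_LATENT, D))

def char_ngrams(word, n=3):
    """Character n-grams of the padded word capture its spelling."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def ngram_embedding(word):
    """Spelling encoding: average of per-n-gram vectors.

    A stable hash (crc32) seeds each n-gram's vector, standing in for a
    learned n-gram embedding table (an assumption of this sketch).
    """
    vecs = [np.random.default_rng(zlib.crc32(g.encode())).normal(size=D)
            for g in char_ngrams(word)]
    return np.mean(vecs, axis=0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sde_encode(word):
    # 1) Encode the word's spelling from its characters.
    c = ngram_embedding(word)
    # 2) Attend over the shared latent semantic embeddings.
    attn = softmax(latent @ c)       # (N_LATENT,)
    s = attn @ latent                # semantic view, (D,)
    # 3) Combine the two views (residual sum, an assumption here).
    return c + s

vec = sde_encode("translation")
print(vec.shape)  # (16,)
```

Because `latent` is shared, words with similar spellings in related languages attend to similar regions of the semantic space, which is the kind of cross-lingual sharing the abstract describes.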

Cite

Text

Wang et al. "Multilingual Neural Machine Translation with Soft Decoupled Encoding." International Conference on Learning Representations, 2019.

Markdown

[Wang et al. "Multilingual Neural Machine Translation with Soft Decoupled Encoding." International Conference on Learning Representations, 2019.](https://mlanthology.org/iclr/2019/wang2019iclr-multilingual/)

BibTeX

@inproceedings{wang2019iclr-multilingual,
  title     = {{Multilingual Neural Machine Translation with Soft Decoupled Encoding}},
  author    = {Wang, Xinyi and Pham, Hieu and Arthur, Philip and Neubig, Graham},
  booktitle = {International Conference on Learning Representations},
  year      = {2019},
  url       = {https://mlanthology.org/iclr/2019/wang2019iclr-multilingual/}
}