Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation

Abstract

Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT). However, it usually fails to achieve notable gains on resource-rich NMT. In this paper, we propose a joint training approach, F2-XEnDec, to combine self-supervised and supervised learning to optimize NMT models. To exploit complementary self-supervised signals for supervised learning, NMT models are trained on examples that are interbred from monolingual and parallel sentences through a new process called crossover encoder-decoder. Experiments on two resource-rich translation benchmarks, WMT'14 English-German and WMT'14 English-French, demonstrate that our approach achieves substantial improvements over several strong baseline methods and obtains a new state of the art of 46.19 BLEU on English-French when incorporating back translation. Results also show that our approach improves model robustness to input perturbations such as code-switching noise, which frequently appears on social media.
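To make the joint objective concrete, the sketch below shows one way a supervised translation loss on parallel data could be combined with a self-supervised reconstruction loss on noised monolingual data. It is a minimal illustration, not the paper's F2-XEnDec procedure: the seq2seq interface model(src_tokens, tgt_prefix) -> logits, the token-dropout noising (standing in for the crossover encoder-decoder interbreeding), and the weighting alpha are all assumptions made for the example.

import torch
import torch.nn.functional as F

def joint_loss(model, parallel_batch, mono_batch, alpha=1.0):
    """Hypothetical joint objective: supervised translation loss plus a
    self-supervised reconstruction loss (illustrative only)."""
    # Supervised term: predict target tokens from source tokens.
    src, tgt = parallel_batch
    sup_logits = model(src, tgt[:, :-1])
    sup_loss = F.cross_entropy(
        sup_logits.reshape(-1, sup_logits.size(-1)),
        tgt[:, 1:].reshape(-1),
    )

    # Self-supervised term: reconstruct monolingual sentences from a
    # noised copy (here, simple token dropout to id 0). This is a
    # placeholder for the crossover encoder-decoder, which interbreeds
    # monolingual and parallel examples in a more involved way.
    mono = mono_batch
    drop_mask = torch.rand_like(mono, dtype=torch.float) < 0.15
    noised = torch.where(drop_mask, torch.zeros_like(mono), mono)
    ssl_logits = model(noised, mono[:, :-1])
    ssl_loss = F.cross_entropy(
        ssl_logits.reshape(-1, ssl_logits.size(-1)),
        mono[:, 1:].reshape(-1),
    )

    # Joint training optimizes both signals together.
    return sup_loss + alpha * ssl_loss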

Cite

Text

Cheng et al. "Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation." International Conference on Machine Learning, 2021.

Markdown

[Cheng et al. "Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/cheng2021icml-selfsupervised/)

BibTeX

@inproceedings{cheng2021icml-selfsupervised,
  title     = {{Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation}},
  author    = {Cheng, Yong and Wang, Wei and Jiang, Lu and Macherey, Wolfgang},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {1825--1835},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/cheng2021icml-selfsupervised/}
}