Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation
Abstract
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT). However, it usually fails to achieve notable gains on resource-rich NMT. In this paper, we propose a joint training approach, F2-XEnDec, to combine self-supervised and supervised learning to optimize NMT models. To exploit complementary self-supervised signals for supervised learning, NMT models are trained on examples that are interbred from monolingual and parallel sentences through a new process called crossover encoder-decoder. Experiments on two resource-rich translation benchmarks, WMT'14 English-German and WMT'14 English-French, demonstrate that our approach achieves substantial improvements over several strong baseline methods and obtains a new state of the art of 46.19 BLEU on English-French when incorporating back translation. Results also show that our approach is capable of improving model robustness to input perturbations such as code-switching noise, which frequently appears on social media.
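As a rough illustration of the idea described in the abstract (not the paper's exact F2-XEnDec formulation), the sketch below mixes a monolingual sentence with a parallel source sentence under a shared random mask to produce an "interbred" example, and combines a supervised translation loss with a weighted self-supervised term. All function names, the mixing rule, and the loss weighting are assumptions made for illustration only.

```python
# Minimal, illustrative sketch (not the authors' released code): token-level
# "crossover" of two sentences under one sampled binary mask, plus a joint
# objective that adds a weighted self-supervised term to the supervised loss.
import random

def crossover(tokens_a, tokens_b, keep_prob=0.5, pad="<pad>"):
    """Mix two token sequences position-wise under a shared random mask."""
    n = max(len(tokens_a), len(tokens_b))
    a = tokens_a + [pad] * (n - len(tokens_a))
    b = tokens_b + [pad] * (n - len(tokens_b))
    mask = [random.random() < keep_prob for _ in range(n)]
    return [x if m else y for x, y, m in zip(a, b, mask)]

def joint_loss(supervised_loss, self_supervised_loss, alpha=1.0):
    """Joint objective: supervised NMT loss plus a weighted self-supervised term."""
    return supervised_loss + alpha * self_supervised_loss

if __name__ == "__main__":
    mono = "der schnelle braune Fuchs".split()   # monolingual sentence
    src = "the quick brown fox".split()          # parallel source sentence
    print(crossover(mono, src))                  # interbred encoder input
    print(joint_loss(2.31, 1.07, alpha=0.5))     # -> 2.845
```

In the paper, the crossover operates inside the encoder-decoder (on encoder inputs and decoder labels) rather than on raw token strings as above; the sketch only conveys the mask-based interbreeding and the joint-loss structure.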
Cite
Text

Cheng et al. "Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation." International Conference on Machine Learning, 2021.

Markdown

[Cheng et al. "Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/cheng2021icml-selfsupervised/)

BibTeX
@inproceedings{cheng2021icml-selfsupervised,
  title     = {{Self-Supervised and Supervised Joint Training for Resource-Rich Machine Translation}},
  author    = {Cheng, Yong and Wang, Wei and Jiang, Lu and Macherey, Wolfgang},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {1825--1835},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/cheng2021icml-selfsupervised/}
}