Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation
Abstract
UNMT tackles translation using only monolingual corpora in the two languages of interest. Since there is no explicit cross-lingual signal, pre-training and synthetic sentence pairs are critical to the success of UNMT. In this work, we empirically study the core training procedure of UNMT to analyze the synthetic sentence pairs obtained from back-translation. We introduce new losses into UNMT that regularize the synthetic sentence pairs, jointly training the UNMT objective and the regularization objective. Our comprehensive experiments show that our method generally improves the performance of currently successful models on three similar language pairs (French-English, German-English, Romanian-English) and one dissimilar pair (Russian-English) at acceptable additional cost.
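The joint training described above can be illustrated with a minimal sketch. The function names (`back_translate`, `joint_loss`) and the weighted-sum combination with coefficient `lam` are illustrative assumptions, not the paper's actual implementation:

```python
# Hedged sketch of the abstract's idea, under assumed names and a simple
# weighted-sum combination: synthetic pairs come from back-translation,
# and the UNMT loss is trained jointly with a regularization loss.

def back_translate(target_sentences, reverse_model):
    """Build synthetic (source, target) pairs by translating target-side
    monolingual text with a reverse-direction model (here any callable)."""
    return [(reverse_model(t), t) for t in target_sentences]

def joint_loss(unmt_loss, reg_loss, lam=0.5):
    """Combine the UNMT objective with the regularization objective.
    The weighting scheme is an assumption for illustration."""
    return unmt_loss + lam * reg_loss

# Toy usage with a stand-in "reverse model" that just tags the sentence.
pairs = back_translate(["le chat", "le chien"], lambda t: "EN:" + t)
total = joint_loss(unmt_loss=2.0, reg_loss=0.8, lam=0.5)  # 2.0 + 0.4 = 2.4
```

The sketch only conveys the structure of the objective; the paper's actual regularization losses operate on the synthetic pairs themselves.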
Cite
Text
Ai and Fang. "Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I14.17479
Markdown
[Ai and Fang. "Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/ai2021aaai-empirical/) doi:10.1609/AAAI.V35I14.17479
BibTeX
@inproceedings{ai2021aaai-empirical,
title = {{Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation}},
author = {Ai, Xi and Fang, Bin},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {12471-12479},
doi = {10.1609/AAAI.V35I14.17479},
url = {https://mlanthology.org/aaai/2021/ai2021aaai-empirical/}
}