Alternating Language Modeling for Cross-Lingual Pre-Training
Abstract
Language model pre-training has achieved success in many natural language processing tasks. Existing methods for cross-lingual pre-training adopt the Translation Language Model, which predicts masked words from the concatenation of a source sentence and its target-language equivalent. In this work, we introduce a novel cross-lingual pre-training method called Alternating Language Modeling (ALM). Instead of simple concatenation, it code-switches sentences of different languages, aiming to capture the rich cross-lingual context of words and phrases. More specifically, we randomly substitute source phrases with their target translations to create code-switched sentences. Then, we use these code-switched data to train the ALM model to predict words of different languages. We evaluate pre-training with ALM on the downstream tasks of machine translation and cross-lingual classification. Experiments show that ALM outperforms previous pre-training methods on three benchmarks.
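The code-switching step the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the phrase-table format (non-overlapping source spans paired with target-phrase tokens), and the substitution probability are all assumptions made for the example.

```python
import random

def code_switch(source_tokens, phrase_translations, swap_prob=0.5, seed=0):
    """Create a code-switched sentence by randomly replacing aligned
    source phrases with their target-language translations.

    phrase_translations: list of ((start, end), target_tokens) pairs giving
    non-overlapping source spans and their translations (hypothetical format).
    """
    rng = random.Random(seed)
    out, i = [], 0
    # Walk the sentence left to right, span by span.
    for (start, end), target in sorted(phrase_translations):
        out.extend(source_tokens[i:start])        # copy tokens outside any span
        if rng.random() < swap_prob:
            out.extend(target)                    # substitute the translation
        else:
            out.extend(source_tokens[start:end])  # keep the source phrase
        i = end
    out.extend(source_tokens[i:])                 # copy the remaining tail
    return out

# Illustrative English/French example (made-up alignment data).
src = "we thank you for your help".split()
phrases = [((0, 2), ["nous", "remercions"]), ((3, 6), ["pour", "votre", "aide"])]
print(code_switch(src, phrases, swap_prob=1.0))
# → ['nous', 'remercions', 'you', 'pour', 'votre', 'aide']
```

With `swap_prob=1.0` every aligned phrase is replaced; intermediate values yield mixed-language sentences, which are then used as training data where the model must predict masked words in either language.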
Cite
Text
Yang et al. "Alternating Language Modeling for Cross-Lingual Pre-Training." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I05.6480
Markdown
[Yang et al. "Alternating Language Modeling for Cross-Lingual Pre-Training." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/yang2020aaai-alternating/) doi:10.1609/AAAI.V34I05.6480
BibTeX
@inproceedings{yang2020aaai-alternating,
title = {{Alternating Language Modeling for Cross-Lingual Pre-Training}},
author = {Yang, Jian and Ma, Shuming and Zhang, Dongdong and Wu, Shuangzhi and Li, Zhoujun and Zhou, Ming},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2020},
pages = {9386-9393},
doi = {10.1609/AAAI.V34I05.6480},
url = {https://mlanthology.org/aaai/2020/yang2020aaai-alternating/}
}