The Unreasonable Effectiveness of Few-Shot Learning for Machine Translation
Abstract
We demonstrate the potential of few-shot translation systems, trained with unpaired language data, for both high and low-resource language pairs. We show that with only 5 examples of high-quality translation data shown at inference, a transformer decoder-only model trained solely with self-supervised learning is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. In particular, we outperform the best performing system on the WMT’21 English-Chinese news translation task using only five examples of English-Chinese parallel data at inference. Furthermore, the resulting models are two orders of magnitude smaller than state-of-the-art language models. We then analyze the factors which impact the performance of few-shot translation systems, and highlight that the quality of the few-shot demonstrations heavily determines the quality of the translations generated by our models. Finally, we show that the few-shot paradigm also provides a way to control certain attributes of the translation: we are able to control for regional varieties and formality using only five examples at inference, paving the way towards controllable machine translation systems.
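As a rough illustration of what "five examples shown at inference" can look like in practice, the sketch below assembles a five-shot translation prompt for a decoder-only language model. The prompt layout, the function name `build_few_shot_prompt`, and the placeholder demonstration pairs are assumptions made for illustration; the abstract does not specify the template the authors used.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    demos: List[Tuple[str, str]],
    source: str,
    src_lang: str = "English",
    tgt_lang: str = "Chinese",
) -> str:
    """Concatenate demonstration pairs and the new source sentence.

    Hypothetical template: each demonstration is rendered as
    "<src_lang>: ...\n<tgt_lang>: ...", and the final block leaves the
    target side empty for the model to complete.
    """
    blocks = [f"{src_lang}: {s}\n{tgt_lang}: {t}" for s, t in demos]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)


# Five placeholder demonstrations (illustrative, not taken from the paper).
demos = [
    ("Good morning.", "早上好。"),
    ("Thank you very much.", "非常感谢。"),
    ("Where is the train station?", "火车站在哪里？"),
    ("The meeting starts at nine.", "会议九点开始。"),
    ("It is raining today.", "今天下雨。"),
]

prompt = build_few_shot_prompt(demos, "The report was published yesterday.")
```

Under this sketch, swapping the demonstrations for pairs written in a particular regional variety or formality level would be the mechanism for steering the output style, since the abstract attributes that control to the choice of few-shot examples.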
Cite
Text
Garcia et al. "The Unreasonable Effectiveness of Few-Shot Learning for Machine Translation." International Conference on Machine Learning, 2023.
Markdown
[Garcia et al. "The Unreasonable Effectiveness of Few-Shot Learning for Machine Translation." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/garcia2023icml-unreasonable/)
BibTeX
@inproceedings{garcia2023icml-unreasonable,
title = {{The Unreasonable Effectiveness of Few-Shot Learning for Machine Translation}},
author = {Garcia, Xavier and Bansal, Yamini and Cherry, Colin and Foster, George and Krikun, Maxim and Johnson, Melvin and Firat, Orhan},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {10867-10878},
volume = {202},
url = {https://mlanthology.org/icml/2023/garcia2023icml-unreasonable/}
}