Nearest Neighbor Machine Translation

Abstract

We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest-neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search. This approach requires no additional training and scales to give the decoder direct access to billions of examples at test time, resulting in a highly expressive model that consistently improves performance across many settings. Simply adding nearest-neighbor search improves a state-of-the-art German-English translation model by 1.5 BLEU. $k$NN-MT allows a single model to be adapted to diverse domains by using a domain-specific datastore, improving results by an average of 9.2 BLEU over zero-shot transfer, and achieving new state-of-the-art results---without training on these domains. A massively multilingual model can also be specialized for particular language pairs, with improvements of 3 BLEU for translating from English into German and Chinese. Qualitatively, $k$NN-MT is easily interpretable; it combines source and target context to retrieve highly relevant examples.
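As a rough illustration of the decoding step the abstract describes, the sketch below shows one plausible way a retrieval-and-interpolation prediction could look, assuming a datastore of (decoder-state, next-token) pairs built from parallel data. The neighbour count `k`, the softmax `temperature`, and the interpolation weight `lam` are illustrative placeholders, and the brute-force distance computation stands in for the approximate search (e.g. FAISS) that a billion-entry datastore would require; this is not the authors' reference implementation.

```python
import numpy as np

def knn_mt_distribution(query, keys, values, vocab_size, base_probs,
                        k=8, temperature=10.0, lam=0.5):
    """Sketch of one kNN-MT decoding step.

    query      : decoder hidden state at the current target position, shape (d,)
    keys       : cached decoder states in the datastore, shape (N, d)
    values     : target token id stored with each key, shape (N,)
    base_probs : base translation model's distribution over the vocabulary, shape (vocab_size,)
    """
    # Squared L2 distance from the query to every cached key (brute force for clarity).
    dists = np.sum((keys - query) ** 2, axis=1)

    # Retrieve the k nearest neighbours.
    nn = np.argsort(dists)[:k]

    # Convert negative distances into weights over the retrieved neighbours.
    weights = np.exp(-dists[nn] / temperature)
    weights /= weights.sum()

    # Aggregate neighbour weights by the target token each neighbour predicts.
    knn_probs = np.zeros(vocab_size)
    np.add.at(knn_probs, values[nn], weights)

    # Interpolate the retrieval distribution with the base model's distribution.
    return lam * knn_probs + (1.0 - lam) * base_probs
```

Because the datastore lookup happens only at test time, swapping in a domain-specific or language-pair-specific datastore changes the model's behaviour without any retraining, which is the adaptation effect the abstract reports.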

Cite

Text

Khandelwal et al. "Nearest Neighbor Machine Translation." International Conference on Learning Representations, 2021.

Markdown

[Khandelwal et al. "Nearest Neighbor Machine Translation." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/khandelwal2021iclr-nearest/)

BibTeX

@inproceedings{khandelwal2021iclr-nearest,
  title     = {{Nearest Neighbor Machine Translation}},
  author    = {Khandelwal, Urvashi and Fan, Angela and Jurafsky, Dan and Zettlemoyer, Luke and Lewis, Mike},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/khandelwal2021iclr-nearest/}
}