Neural Machine Translation with Gumbel-Greedy Decoding
Abstract
Previous neural machine translation models have used heuristic search algorithms (e.g., beam search) to avoid solving the maximum a posteriori problem over translation sentences at test time. In this paper, we propose Gumbel-Greedy Decoding, which trains a generative network to predict translations under a trained model. We solve this problem using the Gumbel-Softmax reparameterization, which makes our generative network differentiable and trainable through standard stochastic gradient methods. We empirically demonstrate that our proposed model is effective for generating sequences of discrete words.
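The Gumbel-Softmax reparameterization named in the abstract can be illustrated with a minimal sketch. This is not the paper's generative network or training procedure; it only shows the general sampling step, in which Gumbel noise is added to the logits and a temperature-scaled softmax gives a relaxed (continuous) one-hot vector. The function name `gumbel_softmax_sample`, the toy three-word vocabulary, and the temperature value are assumptions made for illustration. NumPy provides no automatic differentiation, so only the forward computation is shown.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a relaxed one-hot sample from the categorical defined by `logits`.

    Hypothetical helper for illustration; the last axis indexes the vocabulary.
    """
    # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)), u ~ Uniform(0, 1).
    # The lower bound keeps log(u) finite.
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))

    # Perturb the logits, scale by the temperature, and apply a softmax.
    perturbed = (logits + gumbel_noise) / temperature
    perturbed = perturbed - perturbed.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(perturbed)
    return exp / exp.sum(axis=-1, keepdims=True)

# Toy distribution over a three-word vocabulary (assumed values).
rng = np.random.default_rng(0)
logits = np.log(np.array([0.1, 0.6, 0.3]))

# A relaxed sample: non-negative entries that sum to one.
soft_sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)

# Taking the argmax of the relaxed sample yields a discrete token index.
hard_token = int(np.argmax(soft_sample))
```

Lower temperatures push the relaxed sample toward a one-hot vector, while higher temperatures make it smoother. Because the argmax of the perturbed logits does not depend on the temperature, the discrete token obtained this way is distributed according to the softmax of the original logits.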
Cite
Text
Gu et al. "Neural Machine Translation with Gumbel-Greedy Decoding." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12016
Markdown
[Gu et al. "Neural Machine Translation with Gumbel-Greedy Decoding." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/gu2018aaai-neural/) doi:10.1609/AAAI.V32I1.12016
BibTeX
@inproceedings{gu2018aaai-neural,
title = {{Neural Machine Translation with Gumbel-Greedy Decoding}},
author = {Gu, Jiatao and Im, Daniel Jiwoong and Li, Victor O. K.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2018},
pages = {5125--5132},
doi = {10.1609/AAAI.V32I1.12016},
url = {https://mlanthology.org/aaai/2018/gu2018aaai-neural/}
}