Attention and Augmented Recurrent Neural Networks
Abstract
The basic RNN design struggles with longer sequences, but a special variant, "long short-term memory" (LSTM) networks, can handle them. Such models have proven very powerful, achieving remarkable results on many tasks including translation, voice recognition, and image captioning. As a result, recurrent neural networks have become very widespread in the last few years. As this has happened, we've seen a growing number of attempts to augment RNNs with new properties. Four directions stand out as particularly exciting: Neural Turing Machines, attentional interfaces, adaptive computation time, and neural programmers.
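To make the contrast concrete, here is a minimal sketch (not from the article) of a single vanilla RNN step next to an LSTM step: the LSTM's gated, mostly additive update to its cell state is what lets information and gradients survive across many timesteps. The NumPy implementation, function names, and shapes below are illustrative assumptions.

```python
# Minimal sketch: a vanilla RNN step vs. an LSTM step (illustrative only,
# not the authors' code; names and shapes are assumptions).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(x, h, Wx, Wh, b):
    # Vanilla RNN: the whole hidden state is squashed and overwritten each
    # step, which makes it hard to carry information across long sequences.
    return np.tanh(x @ Wx + h @ Wh + b)

def lstm_step(x, h, c, W, b):
    # LSTM: forget (f), input (i), and output (o) gates control how much of
    # the cell state c is kept, written, and exposed. The largely additive
    # update to c gives gradients a more direct path through time.
    z = np.concatenate([x, h]) @ W + b      # one affine map producing all gates
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    c_new = f * c + i * np.tanh(g)          # gated, mostly additive cell update
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 8, 16
x = rng.normal(size=n_in)
h = np.zeros(n_hid)

Wx = rng.normal(scale=0.1, size=(n_in, n_hid))
Wh = rng.normal(scale=0.1, size=(n_hid, n_hid))
h_rnn = rnn_step(x, h, Wx, Wh, np.zeros(n_hid))

W = rng.normal(scale=0.1, size=(n_in + n_hid, 4 * n_hid))
c = np.zeros(n_hid)
h_lstm, c_new = lstm_step(x, h, c, W, np.zeros(4 * n_hid))
print(h_rnn.shape, h_lstm.shape, c_new.shape)   # (16,) (16,) (16,)
```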
Cite
@article{olah2016distill-attention,
  title = {{Attention and Augmented Recurrent Neural Networks}},
  author = {Olah, Chris and Carter, Shan},
  journal = {Distill},
  year = {2016},
  doi = {10.23915/distill.00001},
  url = {https://mlanthology.org/distill/2016/olah2016distill-attention/}
}