Do RNN and LSTM Have Long Memory?
Abstract
The LSTM network was proposed to overcome the difficulty of learning long-term dependence, and has achieved significant advances in applications. With its success and drawbacks in mind, this paper raises the question: do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition of long memory networks is further introduced, which requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks with a minimal modification, and their superiority is illustrated in modeling the long-term dependence of various datasets.
Cite
Text
Zhao et al. "Do RNN and LSTM Have Long Memory?" International Conference on Machine Learning, 2020.
Markdown
[Zhao et al. "Do RNN and LSTM Have Long Memory?" International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/zhao2020icml-rnn/)
BibTeX
@inproceedings{zhao2020icml-rnn,
title = {{Do RNN and LSTM Have Long Memory?}},
author = {Zhao, Jingyu and Huang, Feiqing and Lv, Jia and Duan, Yanjie and Qin, Zhen and Li, Guodong and Tian, Guangjian},
booktitle = {International Conference on Machine Learning},
year = {2020},
pages = {11365--11375},
volume = {119},
url = {https://mlanthology.org/icml/2020/zhao2020icml-rnn/}
}