Do RNN and LSTM Have Long Memory?

Abstract

The LSTM network was proposed to overcome the difficulty of learning long-term dependencies, and it has achieved significant success in applications. With both its successes and its drawbacks in mind, this paper raises the question: do RNN and LSTM have long memory? We answer it in part by proving that, from a statistical perspective, RNN and LSTM do not have long memory. A new definition of long memory networks is then introduced, which requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks via a minimal modification, and their superiority is illustrated in modeling the long-term dependence of various datasets.
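For context, the sketch below states the standard ARFIMA-style definition of long memory and the polynomial weight decay it implies. This is a common textbook formulation, assumed here for illustration; the paper's exact definition may differ in detail.

% A common statistical definition of long memory (assumed here): a stationary
% process {x_t} has long memory if its autocovariances gamma(k) are not
% absolutely summable.
\[
  \sum_{k=0}^{\infty} \lvert \gamma(k) \rvert = \infty,
  \qquad \text{e.g.}\quad \gamma(k) \sim c\,k^{2d-1} \ \text{as } k \to \infty,
  \quad 0 < d < \tfrac{1}{2}.
\]
% The fractional-integration filter realizes such behavior with weights that
% decay at a polynomial (not geometric) rate, matching the abstract's
% requirement on model weights:
\[
  (1-B)^{-d} \;=\; \sum_{k=0}^{\infty} w_k B^k,
  \qquad w_k = \binom{k+d-1}{k} \sim \frac{k^{d-1}}{\Gamma(d)} \ \text{as } k \to \infty .
\]

Roughly, a stable recurrent update propagates state through repeated contraction, so its effective weights decay geometrically rather than polynomially; this is the short-memory behavior the paper's statistical argument formalizes.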

Cite

Text

Zhao et al. "Do RNN and LSTM Have Long Memory?" International Conference on Machine Learning, 2020.

Markdown

[Zhao et al. "Do RNN and LSTM Have Long Memory?" International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/zhao2020icml-rnn/)

BibTeX

@inproceedings{zhao2020icml-rnn,
  title     = {{Do RNN and LSTM Have Long Memory?}},
  author    = {Zhao, Jingyu and Huang, Feiqing and Lv, Jia and Duan, Yanjie and Qin, Zhen and Li, Guodong and Tian, Guangjian},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {11365--11375},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/zhao2020icml-rnn/}
}