Recurrent Orthogonal Networks and Long-Memory Tasks

Abstract

Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long-term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
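The abstract's reference to orthogonal (or unitary) transition matrices can be made concrete with a minimal sketch, not taken from the paper itself: a vanilla RNN whose recurrent weight matrix is initialized to a random orthogonal matrix, so repeated application preserves the norm of the hidden state over many time steps. The function names below (random_orthogonal, rnn_step) are illustrative, and NumPy is assumed.

import numpy as np

def random_orthogonal(n, rng=None):
    # Sample a random orthogonal matrix via QR decomposition of a Gaussian matrix.
    rng = np.random.default_rng() if rng is None else rng
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # Adjust column signs so the result is uniformly distributed over the orthogonal group.
    q = q * np.sign(np.diag(r))
    return q

def rnn_step(h, x, W, U, b):
    # One vanilla RNN step: h_{t+1} = tanh(W h_t + U x_t + b), with W orthogonal.
    return np.tanh(W @ h + U @ x + b)

# Illustration: the hidden-state norm stays bounded over many steps when W is orthogonal.
n_hidden, n_input, T = 128, 8, 1000
rng = np.random.default_rng(0)
W = random_orthogonal(n_hidden, rng)
U = rng.standard_normal((n_hidden, n_input)) * 0.01
b = np.zeros(n_hidden)

h = np.zeros(n_hidden)
for t in range(T):
    h = rnn_step(h, rng.standard_normal(n_input), W, U, b)
print("||h|| after", T, "steps:", np.linalg.norm(h))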

Cite

Text

Henaff et al. "Recurrent Orthogonal Networks and Long-Memory Tasks." International Conference on Machine Learning, 2016.

Markdown

[Henaff et al. "Recurrent Orthogonal Networks and Long-Memory Tasks." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/henaff2016icml-recurrent/)

BibTeX

@inproceedings{henaff2016icml-recurrent,
  title     = {{Recurrent Orthogonal Networks and Long-Memory Tasks}},
  author    = {Henaff, Mikael and Szlam, Arthur and LeCun, Yann},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {2034-2042},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/henaff2016icml-recurrent/}
}