Recurrent Orthogonal Networks and Long-Memory Tasks

Abstract

Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long-term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
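The abstract's reference to orthogonal (or unitary) transition matrices can be made concrete with a minimal sketch, not taken from the paper itself: a vanilla RNN whose recurrent weight matrix is initialized to a random orthogonal matrix, so repeated application preserves the norm of the hidden state over many time steps. The function names below (random_orthogonal, rnn_step) are illustrative, and NumPy is assumed.

import numpy as np

def random_orthogonal(n, rng=None):
    # Sample a random orthogonal matrix via QR decomposition of a Gaussian matrix.
    rng = np.random.default_rng() if rng is None else rng
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # Adjust column signs so the result is uniformly distributed over the orthogonal group.
    q = q * np.sign(np.diag(r))
    return q

def rnn_step(h, x, W, U, b):
    # One vanilla RNN step: h_{t+1} = tanh(W h_t + U x_t + b), with W orthogonal.
    return np.tanh(W @ h + U @ x + b)

# Illustration: the hidden-state norm stays bounded over many steps when W is orthogonal.
n_hidden, n_input, T = 128, 8, 1000
rng = np.random.default_rng(0)
W = random_orthogonal(n_hidden, rng)
U = rng.standard_normal((n_hidden, n_input)) * 0.01
b = np.zeros(n_hidden)

h = np.zeros(n_hidden)
for t in range(T):
    h = rnn_step(h, rng.standard_normal(n_input), W, U, b)
print("||h|| after", T, "steps:", np.linalg.norm(h))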

Cite

Text

Henaff et al. "Recurrent Orthogonal Networks and Long-Memory Tasks." International Conference on Machine Learning, 2016.

Markdown

[Henaff et al. "Recurrent Orthogonal Networks and Long-Memory Tasks." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/henaff2016icml-recurrent/)

BibTeX

@inproceedings{henaff2016icml-recurrent,
  title     = {{Recurrent Orthogonal Networks and Long-Memory Tasks}},
  author    = {Henaff, Mikael and Szlam, Arthur and LeCun, Yann},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {2034-2042},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/henaff2016icml-recurrent/}
}