Recurrent Orthogonal Networks and Long-Memory Tasks
Abstract
Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
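The abstract refers to methods that initialize or constrain the recurrent transition matrix to be unitary or orthogonal, so that repeated application neither amplifies nor attenuates the hidden state. As an illustration only (not the authors' construction), the sketch below samples a random orthogonal matrix via QR decomposition and uses it as the hidden-to-hidden weight of a vanilla RNN; the function and variable names are hypothetical.

```python
import numpy as np

def random_orthogonal(n, rng=None):
    # Sample an n x n orthogonal matrix by QR-decomposing a Gaussian matrix.
    rng = np.random.default_rng() if rng is None else rng
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # Fix column signs so the result is uniformly distributed over the orthogonal group.
    q *= np.sign(np.diag(r))
    return q

# Illustrative orthogonal initialization of an RNN transition matrix W_hh:
# orthogonality preserves the norm of the hidden state across time steps,
# which is the property the unitary/orthogonal RNN methods exploit.
hidden_size = 128
W_hh = random_orthogonal(hidden_size)
assert np.allclose(W_hh @ W_hh.T, np.eye(hidden_size), atol=1e-6)
```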
Cite
Text
Henaff et al. "Recurrent Orthogonal Networks and Long-Memory Tasks." International Conference on Machine Learning, 2016.
Markdown
[Henaff et al. "Recurrent Orthogonal Networks and Long-Memory Tasks." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/henaff2016icml-recurrent/)
BibTeX
@inproceedings{henaff2016icml-recurrent,
title = {{Recurrent Orthogonal Networks and Long-Memory Tasks}},
author = {Henaff, Mikael and Szlam, Arthur and LeCun, Yann},
booktitle = {International Conference on Machine Learning},
year = {2016},
pages = {2034-2042},
volume = {48},
url = {https://mlanthology.org/icml/2016/henaff2016icml-recurrent/}
}