Hierarchical Recurrent Neural Networks for Long-Term Dependencies

Abstract

We have already shown that extracting long-term dependencies from sequential data is difficult, both for deterministic dynamical systems such as recurrent networks, and probabilistic models such as hidden Markov models (HMMs) or input/output hidden Markov models (IOHMMs). In practice, to avoid this problem, researchers have used domain-specific a priori knowledge to give meaning to the hidden or state variables representing past context. In this paper, we propose to use a more general type of a priori knowledge, namely that the temporal dependencies are structured hierarchically. This implies that long-term dependencies are represented by variables with a long time scale. This principle is applied to a recurrent network which includes delays and multiple time scales. Experiments confirm the advantages of such structures. A similar approach is proposed for HMMs and IOHMMs.
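The core architectural idea, state variables that operate at different time scales, can be illustrated with a short sketch. The following is a minimal illustration, not the authors' implementation; the layer sizes, weight names, update rule, and the `period` parameter are all assumptions made here. A lower recurrent layer updates at every time step, while an upper layer updates only every `period` steps, so information carrying long-term dependencies passes through far fewer nonlinear transitions.

```python
# Minimal sketch of a two-time-scale hierarchical RNN (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def init(n_in, n_fast, n_slow):
    """Random weights for a fast (every-step) and a slow (subsampled) layer."""
    s = lambda *shape: rng.normal(0.0, 0.1, shape)
    return {
        "W_in":   s(n_fast, n_in),    # input -> fast layer
        "W_fast": s(n_fast, n_fast),  # fast-layer recurrence
        "W_top":  s(n_fast, n_slow),  # slow state feeds back into fast layer
        "W_up":   s(n_slow, n_fast),  # fast state feeds the slow layer
        "W_slow": s(n_slow, n_slow),  # slow-layer recurrence
    }

def run(params, xs, period=4):
    """Run the hierarchy over a sequence xs of input vectors."""
    h_fast = np.zeros(params["W_fast"].shape[0])
    h_slow = np.zeros(params["W_slow"].shape[0])
    states = []
    for t, x in enumerate(xs):
        # Fast layer updates every step, conditioned on the slow state.
        h_fast = np.tanh(params["W_in"] @ x
                         + params["W_fast"] @ h_fast
                         + params["W_top"] @ h_slow)
        # Slow layer updates only every `period` steps: a longer time scale,
        # so gradients for long-term dependencies cross fewer transitions.
        if t % period == 0:
            h_slow = np.tanh(params["W_up"] @ h_fast
                             + params["W_slow"] @ h_slow)
        states.append((h_fast.copy(), h_slow.copy()))
    return states

params = init(n_in=3, n_fast=8, n_slow=4)
xs = rng.normal(size=(20, 3))
states = run(params, xs, period=4)
print(len(states), states[-1][1])  # 20 steps; final slow-layer state
```

With `period=4`, a dependency spanning 20 input steps is mediated by only 5 slow-layer updates, which is the sense in which long-term dependencies are represented by variables with a long time scale.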

Cite

Text

El Hihi and Bengio. "Hierarchical Recurrent Neural Networks for Long-Term Dependencies." Neural Information Processing Systems, 1995.

Markdown

[El Hihi and Bengio. "Hierarchical Recurrent Neural Networks for Long-Term Dependencies." Neural Information Processing Systems, 1995.](https://mlanthology.org/neurips/1995/hihi1995neurips-hierarchical/)

BibTeX

@inproceedings{hihi1995neurips-hierarchical,
  title     = {{Hierarchical Recurrent Neural Networks for Long-Term Dependencies}},
  author    = {El Hihi, Salah and Bengio, Yoshua},
  booktitle = {Neural Information Processing Systems},
  year      = {1995},
  pages     = {493--499},
  url       = {https://mlanthology.org/neurips/1995/hihi1995neurips-hierarchical/}
}