On the Memory Mechanism of Tensor-Power Recurrent Models

Abstract

The tensor-power (TP) recurrent model is a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (a.k.a. degree-p) tensor product. Although such models frequently appear in advanced recurrent neural networks (RNNs), to date there has been limited study of their memory property, a critical characteristic in sequence tasks. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it also leads to unstable dynamical behaviors. Empirically, we tackle this issue by extending the degree p from the discrete domain to a differentiable one, so that it can be learned efficiently from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We experimentally show that the proposed model achieves competitive performance compared to various advanced RNNs in both the single-cell and seq2seq architectures.
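To make the recurrence concrete, below is a minimal, hypothetical sketch of a degree-p tensor-power update: the next hidden state is obtained by contracting a weight matrix with the p-fold tensor (outer) product of the previous state. The function name, shapes, and exact update rule here are illustrative assumptions, not the paper's formulation or its differentiable-degree extension.

```python
import numpy as np

def tp_step(h_prev, W, p):
    """One hypothetical recurrence step using the p-fold tensor power of h_prev.

    h_prev: hidden state of shape (d,)
    W:      weights of shape (d, d**p), applied to the flattened tensor power
    p:      integer degree of the tensor product
    """
    z = h_prev
    for _ in range(p - 1):
        z = np.kron(z, h_prev)      # build h ⊗ h ⊗ ... ⊗ h (p copies), flattened
    return np.tanh(W @ z)           # non-linear state update

d, p = 4, 2                         # p = 1 recovers a plain first-order recurrence
rng = np.random.default_rng(0)
h = rng.standard_normal(d)
W = 0.1 * rng.standard_normal((d, d ** p))
for _ in range(5):                  # roll the recurrence forward a few steps
    h = tp_step(h, W, p)
print(h)
```

As the abstract notes, larger p strengthens the long-memory effect but makes the dynamics harder to keep stable, which is why the paper instead treats the degree as a continuous, learnable quantity.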

Cite

Text

Qiu et al. "On the Memory Mechanism of Tensor-Power Recurrent Models." Artificial Intelligence and Statistics, 2021.

Markdown

[Qiu et al. "On the Memory Mechanism of Tensor-Power Recurrent Models." Artificial Intelligence and Statistics, 2021.](https://mlanthology.org/aistats/2021/qiu2021aistats-memory/)

BibTeX

@inproceedings{qiu2021aistats-memory,
  title     = {{On the Memory Mechanism of Tensor-Power Recurrent Models}},
  author    = {Qiu, Hejia and Li, Chao and Weng, Ying and Sun, Zhun and He, Xingyu and Zhao, Qibin},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2021},
  pages     = {3682--3690},
  volume    = {130},
  url       = {https://mlanthology.org/aistats/2021/qiu2021aistats-memory/}
}