State-Regularized Recurrent Neural Networks

Abstract

Recurrent neural networks are a widely used class of neural architectures with two shortcomings. First, it is difficult to understand what exactly they learn. Second, they tend to perform poorly on sequences requiring long-term memorization, despite having this capacity in principle. We aim to address both shortcomings with a class of recurrent networks that use a stochastic state transition mechanism between cell applications. This mechanism, which we term state-regularization, makes RNNs transition between a finite set of learnable states. We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) nonregular languages such as balanced parentheses, palindromes, and the copy task, where external memory is required; and (3) real-world sequence learning tasks for sentiment analysis, visual object recognition, and language modeling. We show that state-regularization simplifies the extraction of finite state automata from the RNN’s state transition dynamics; forces RNNs to operate more like automata with external memory and less like finite state machines; and makes RNNs more interpretable.
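
The abstract describes the mechanism only at a high level. Below is a minimal PyTorch sketch of one way a soft transition over a finite set of learnable states could be realized after each cell application; the class name StateRegularizedGRUCell, the dot-product scoring, and the temperature parameter are illustrative assumptions, not necessarily the authors' exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class StateRegularizedGRUCell(nn.Module):
    """Sketch of a state-regularized recurrent cell.

    After each application of an ordinary GRU cell, the hidden state is
    replaced by a convex combination of k learnable centroid states,
    weighted by a softmax over scaled dot-product scores. The hidden
    state therefore stays close to a finite set of states, which is the
    property that supports automata extraction.
    """

    def __init__(self, input_size, hidden_size, num_states, temperature=1.0):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        # Learnable finite set of states ("centroids"); num_states is a hyperparameter.
        self.centroids = nn.Parameter(torch.randn(num_states, hidden_size))
        self.temperature = temperature

    def forward(self, x, h):
        u = self.cell(x, h)                                 # (batch, hidden)
        scores = u @ self.centroids.t() / self.temperature  # (batch, num_states)
        alpha = F.softmax(scores, dim=-1)                    # transition probabilities over states
        h_new = alpha @ self.centroids                       # soft transition to a mixture of centroids
        return h_new, alpha


# Usage: unroll over a toy sequence and inspect the per-step state probabilities.
if __name__ == "__main__":
    cell = StateRegularizedGRUCell(input_size=8, hidden_size=16, num_states=5)
    x = torch.randn(4, 10, 8)                                # (batch, time, features)
    h = torch.zeros(4, 16)
    for t in range(x.size(1)):
        h, alpha = cell(x[:, t], h)
    print(alpha.argmax(dim=-1))                              # most likely discrete state per example

Lowering the temperature pushes the softmax toward a near one-hot choice, so the hidden state effectively snaps to a single learnable state and the sequence of argmax indices can be read off as transitions of a finite state automaton.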

Cite

Text

Wang and Niepert. "State-Regularized Recurrent Neural Networks." International Conference on Machine Learning, 2019.

Markdown

[Wang and Niepert. "State-Regularized Recurrent Neural Networks." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/wang2019icml-stateregularized/)

BibTeX

@inproceedings{wang2019icml-stateregularized,
  title     = {{State-Regularized Recurrent Neural Networks}},
  author    = {Wang, Cheng and Niepert, Mathias},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {6596--6606},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/wang2019icml-stateregularized/}
}