A Smoothing Regularizer for Feedforward and Recurrent Neural Networks

Abstract

We derive a smoothing regularizer for dynamic network models by requiring robustness in prediction performance to perturbations of the training data. The regularizer can be viewed as a generalization of the first-order Tikhonov stabilizer to dynamic models. For two layer networks with recurrent connections described by the training criterion with the regularizer is where Φ = U, V, W is the network parameter set, Z(t) are the targets, I(t) = X(s), s = 1,2, …, t represents the current and all historical input information, N is the size of the training data set, [Formula: see text] is the regularizer, and λ is a regularization parameter. The closed-form expression for the regularizer for time-lagged recurrent networks is where ‖ ‖ is the Euclidean matrix norm and γ is a factor that depends upon the maximal value of the first derivatives of the internal unit activations f(). Simplifications of the regularizer are obtained for simultaneous recurrent nets (τ ↦ 0), two-layer feedforward nets, and one layer linear nets. We have successfully tested this regularizer in a number of case studies and found that it performs better than standard quadratic weight decay.

Cite

Text

Wu and Moody. "A Smoothing Regularizer for Feedforward and Recurrent Neural Networks." Neural Computation, 1996. doi:10.1162/NECO.1996.8.3.461

Markdown

[Wu and Moody. "A Smoothing Regularizer for Feedforward and Recurrent Neural Networks." Neural Computation, 1996.](https://mlanthology.org/neco/1996/wu1996neco-smoothing/) doi:10.1162/NECO.1996.8.3.461

BibTeX

@article{wu1996neco-smoothing,
  title     = {{A Smoothing Regularizer for Feedforward and Recurrent Neural Networks}},
  author    = {Wu, Lizhong and Moody, John E.},
  journal   = {Neural Computation},
  year      = {1996},
  pages     = {461-489},
  doi       = {10.1162/NECO.1996.8.3.461},
  volume    = {8},
  url       = {https://mlanthology.org/neco/1996/wu1996neco-smoothing/}
}