A Smoothing Regularizer for Feedforward and Recurrent Neural Networks
Abstract
We derive a smoothing regularizer for dynamic network models by requiring robustness in prediction performance to perturbations of the training data. The regularizer can be viewed as a generalization of the first-order Tikhonov stabilizer to dynamic models. For two layer networks with recurrent connections described by the training criterion with the regularizer is where Φ = U, V, W is the network parameter set, Z(t) are the targets, I(t) = X(s), s = 1,2, …, t represents the current and all historical input information, N is the size of the training data set, [Formula: see text] is the regularizer, and λ is a regularization parameter. The closed-form expression for the regularizer for time-lagged recurrent networks is where ‖ ‖ is the Euclidean matrix norm and γ is a factor that depends upon the maximal value of the first derivatives of the internal unit activations f(). Simplifications of the regularizer are obtained for simultaneous recurrent nets (τ ↦ 0), two-layer feedforward nets, and one layer linear nets. We have successfully tested this regularizer in a number of case studies and found that it performs better than standard quadratic weight decay.
Cite
Text
Wu and Moody. "A Smoothing Regularizer for Feedforward and Recurrent Neural Networks." Neural Computation, 1996. doi:10.1162/NECO.1996.8.3.461Markdown
[Wu and Moody. "A Smoothing Regularizer for Feedforward and Recurrent Neural Networks." Neural Computation, 1996.](https://mlanthology.org/neco/1996/wu1996neco-smoothing/) doi:10.1162/NECO.1996.8.3.461BibTeX
@article{wu1996neco-smoothing,
title = {{A Smoothing Regularizer for Feedforward and Recurrent Neural Networks}},
author = {Wu, Lizhong and Moody, John E.},
journal = {Neural Computation},
year = {1996},
pages = {461-489},
doi = {10.1162/NECO.1996.8.3.461},
volume = {8},
url = {https://mlanthology.org/neco/1996/wu1996neco-smoothing/}
}