Diagrammatic Derivation of Gradient Algorithms for Neural Networks
Abstract
Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. In this paper, we show how to derive such algorithms via a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms including backpropagation and backpropagation-through-time without a single chain rule expansion. Additional examples are provided for a variety of complicated architectures to illustrate both the generality and the simplicity of the approach.
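The core idea (obtaining the gradient algorithm by reversing the signal flow of the network's block diagram, transposing linear blocks and replacing nonlinearities by multiplication with their derivatives) can be illustrated with a minimal sketch. The two-layer network, weights, and squared-error loss below are illustrative assumptions, not taken from the paper; the backward pass is built block-by-block from the reversed diagram and checked against a finite-difference estimate.

```python
import numpy as np

# Forward flow graph: x -> W1 -> tanh -> W2 -> y, with loss L = 0.5*||y - t||^2.
# Reversed-diagram rule (hedged paraphrase): run the error signal backward
# through the same diagram, transposing each weight matrix and turning each
# nonlinearity into an elementwise multiply by its derivative.

rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4))   # maps 4-dim input to 3-dim hidden layer
W2 = rng.standard_normal((2, 3))   # maps hidden layer to 2-dim output
x = rng.standard_normal(4)
t = rng.standard_normal(2)         # target

def forward(W1, W2, x):
    a = W1 @ x          # linear block
    h = np.tanh(a)      # nonlinear block
    y = W2 @ h          # linear block
    return a, h, y

a, h, y = forward(W1, W2, x)
e = y - t               # dL/dy for L = 0.5*||y - t||^2

# Backward pass read directly off the reversed diagram:
delta_h = W2.T @ e                            # reversed linear block: transpose
delta_a = (1.0 - np.tanh(a) ** 2) * delta_h   # reversed tanh: multiply by derivative
grad_W2 = np.outer(e, h)                      # gradient tapped at each weight block
grad_W1 = np.outer(delta_a, x)

# Sanity check: compare one entry of grad_W1 with a finite-difference estimate.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
_, _, yp = forward(W1p, W2, x)
num = (0.5 * np.sum((yp - t) ** 2) - 0.5 * np.sum((y - t) ** 2)) / eps
print(abs(num - grad_W1[0, 0]) < 1e-4)  # True: the reversed diagram yields the gradient
```

No chain-rule expansion appears explicitly above: each backward line is the mirror image of a forward block, which is the bookkeeping-free derivation the abstract describes.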
Cite
Text
Wan and Beaufays. "Diagrammatic Derivation of Gradient Algorithms for Neural Networks." Neural Computation, 1996. doi:10.1162/NECO.1996.8.1.182
Markdown
[Wan and Beaufays. "Diagrammatic Derivation of Gradient Algorithms for Neural Networks." Neural Computation, 1996.](https://mlanthology.org/neco/1996/wan1996neco-diagrammatic/) doi:10.1162/NECO.1996.8.1.182
BibTeX
@article{wan1996neco-diagrammatic,
title = {{Diagrammatic Derivation of Gradient Algorithms for Neural Networks}},
author = {Wan, Eric A. and Beaufays, Françoise},
journal = {Neural Computation},
year = {1996},
pages = {182-201},
doi = {10.1162/NECO.1996.8.1.182},
volume = {8},
url = {https://mlanthology.org/neco/1996/wan1996neco-diagrammatic/}
}