Diagrammatic Derivation of Gradient Algorithms for Neural Networks

Abstract

Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. In this paper, we show how to derive such algorithms via a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms including backpropagation and backpropagation-through-time without a single chain rule expansion. Additional examples are provided for a variety of complicated architectures to illustrate both the generality and the simplicity of the approach.
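As a point of reference for the "chain rule expansions" the paper eliminates, here is a minimal sketch (not the paper's diagrammatic method) of a hand-derived gradient for a toy scalar two-layer network, checked against finite differences. All names and values are illustrative assumptions, not taken from the paper.

```python
import math

def loss(w1, w2, x, t):
    """Forward pass: scalar two-layer network y = w2 * tanh(w1 * x)."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return 0.5 * (y - t) ** 2

def grads_chain_rule(w1, w2, x, t):
    """Gradients via explicit chain rule expansions (hand-derived)."""
    h = math.tanh(w1 * x)
    y = w2 * h
    e = y - t                          # dL/dy
    dw2 = e * h                        # dL/dw2 = dL/dy * dy/dw2
    dw1 = e * w2 * (1 - h ** 2) * x    # dL/dw1, through tanh'(u) = 1 - tanh(u)^2
    return dw1, dw2

# Sanity check: compare hand-derived gradients to central finite differences.
w1, w2, x, t, eps = 0.8, 1.2, 0.5, 1.0, 1e-6
g1, g2 = grads_chain_rule(w1, w2, x, t)
fd1 = (loss(w1 + eps, w2, x, t) - loss(w1 - eps, w2, x, t)) / (2 * eps)
fd2 = (loss(w1, w2 + eps, x, t) - loss(w1, w2 - eps, x, t)) / (2 * eps)
```

Even for this two-parameter network the derivation requires tracking one chain-rule factor per layer; the bookkeeping grows quickly for time-dependent architectures, which is the burden the diagrammatic rules are designed to remove.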

Cite

Text

Wan and Beaufays. "Diagrammatic Derivation of Gradient Algorithms for Neural Networks." Neural Computation, 1996. doi:10.1162/NECO.1996.8.1.182

Markdown

[Wan and Beaufays. "Diagrammatic Derivation of Gradient Algorithms for Neural Networks." Neural Computation, 1996.](https://mlanthology.org/neco/1996/wan1996neco-diagrammatic/) doi:10.1162/NECO.1996.8.1.182

BibTeX

@article{wan1996neco-diagrammatic,
  title     = {{Diagrammatic Derivation of Gradient Algorithms for Neural Networks}},
  author    = {Wan, Eric A. and Beaufays, Françoise},
  journal   = {Neural Computation},
  year      = {1996},
  pages     = {182--201},
  doi       = {10.1162/NECO.1996.8.1.182},
  volume    = {8},
  url       = {https://mlanthology.org/neco/1996/wan1996neco-diagrammatic/}
}