Energetic Natural Gradient Descent

Abstract

We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance between distributions is measured using an approximation of energy distance (as opposed to Kullback-Leibler divergence, which produces the Fisher information matrix), and so we refer to our new ascent direction as the energetic natural gradient.
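For intuition, the general natural-gradient recipe works as follows: a divergence D between distributions induces a metric tensor G(θ) through its second-order expansion, and the steepest-ascent direction becomes G(θ)⁻¹∇f(θ). The sketch below uses our own notation and standard definitions; the paper's precise construction of the energetic metric, built from an approximation of energy distance, is given in the paper itself.

\theta_{t+1} = \theta_t + \alpha_t \, G(\theta_t)^{-1} \nabla_\theta f(\theta_t),
\qquad
D\big(p_\theta, p_{\theta + \Delta}\big) \approx \tfrac{1}{2} \, \Delta^\top G(\theta) \, \Delta .

Taking D to be Kullback-Leibler divergence yields the Fisher information matrix,
G_{ij}(\theta) = \mathbb{E}\big[\partial_{\theta_i} \log p_\theta(X) \, \partial_{\theta_j} \log p_\theta(X)\big],
whereas the energetic natural gradient instead starts from energy distance, defined for distributions P and Q, with independent samples X, X' \sim P and Y, Y' \sim Q, as

D_E(P, Q) = 2\,\mathbb{E}\|X - Y\| - \mathbb{E}\|X - X'\| - \mathbb{E}\|Y - Y'\| .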

Cite

Text

Thomas et al. "Energetic Natural Gradient Descent." International Conference on Machine Learning, 2016.

Markdown

[Thomas et al. "Energetic Natural Gradient Descent." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/thomas2016icml-energetic/)

BibTeX

@inproceedings{thomas2016icml-energetic,
  title     = {{Energetic Natural Gradient Descent}},
  author    = {Thomas, Philip and da Silva, Bruno Castro and Dann, Christoph and Brunskill, Emma},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {2887--2895},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/thomas2016icml-energetic/}
}