On the Convergence Properties of Contrastive Divergence

Abstract

Contrastive Divergence (CD) is a popular method for estimating the parameters of Markov Random Fields (MRFs) by rapidly approximating an intractable term in the gradient of the log probability. Despite CD’s empirical success, little is known about its theoretical convergence properties. In this paper, we analyze the CD$_1$ update rule for Restricted Boltzmann Machines (RBMs) with binary variables. We show that this update is not the gradient of any function, and construct a counterintuitive “regularization function” that causes CD learning to cycle indefinitely. Nonetheless, we show that the regularized CD update has a fixed point for a large class of regularization functions using Brouwer’s fixed point theorem.
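For context, the CD$_1$ update the abstract refers to replaces the intractable model expectation in the log-likelihood gradient with statistics from a single Gibbs step. Below is a minimal NumPy sketch of this update for a binary RBM; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_v, b_h, v0, lr=0.1, rng=np.random.default_rng()):
    """One CD_1 step for a binary RBM (illustrative sketch).

    W: (num_visible, num_hidden) weights; b_v, b_h: bias vectors;
    v0: (batch, num_visible) batch of binary training data.
    """
    # Positive phase: hidden probabilities and samples given the data.
    h0_prob = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)

    # One Gibbs step: reconstruct the visibles, then recompute hiddens.
    v1_prob = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(v1_prob.shape) < v1_prob).astype(float)
    h1_prob = sigmoid(v1 @ W + b_h)

    # CD_1: data statistics minus one-step reconstruction statistics
    # stand in for the intractable model expectation.
    batch = v0.shape[0]
    W += lr * (v0.T @ h0_prob - v1.T @ h1_prob) / batch
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_v, b_h
```

As the paper shows, this update direction is not the gradient of any objective function, which is what makes its convergence behavior nontrivial to analyze.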

Cite

Text

Sutskever and Tieleman. "On the Convergence Properties of Contrastive Divergence." Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.

Markdown

[Sutskever and Tieleman. "On the Convergence Properties of Contrastive Divergence." Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.](https://mlanthology.org/aistats/2010/sutskever2010aistats-convergence/)

BibTeX

@inproceedings{sutskever2010aistats-convergence,
  title     = {{On the Convergence Properties of Contrastive Divergence}},
  author    = {Sutskever, Ilya and Tieleman, Tijmen},
  booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  year      = {2010},
  pages     = {789--795},
  volume    = {9},
  url       = {https://mlanthology.org/aistats/2010/sutskever2010aistats-convergence/}
}