Stochastic Difference of Convex Algorithm and Its Application to Training Deep Boltzmann Machines

Nitanda, Atsushi; Suzuki, Taiji

AISTATS 2017 pp. 470-478

Abstract

Difference of convex functions (DC) programming is an important approach to nonconvex optimization because DC structure arises in several fields. Effective optimization methods, called DC algorithms, have been developed in the deterministic optimization literature. In machine learning, many important learning problems, such as training Boltzmann machines (BMs), can be formulated as DC programs. However, no DC-like algorithm with a convergence rate guarantee exists for stochastic problems, which are the more suitable setting for machine learning tasks. In this paper, we propose a stochastic variant of the DC algorithm and give the computational complexity of converging to a stationary point in several settings. Moreover, we show that our method includes the expectation-maximization (EM) and Monte Carlo EM (MCEM) algorithms as special cases when training BMs. In other words, we extend the EM/MCEM algorithms to more effective methods from the DC viewpoint, with theoretical convergence guarantees. Experimental results indicate that our method performs well for training binary restricted Boltzmann machines and deep Boltzmann machines without pre-training.
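
For readers unfamiliar with DC algorithms, the following is a minimal sketch of the classical deterministic DC iteration on a toy problem. It is not the stochastic method proposed in the paper. The objective f(x) = x^4/4 - x^2/2, its split into g(x) = x^4/4 and h(x) = x^2/2, and the names `dca_quartic` and `f` are all assumptions chosen for illustration. Each step linearizes h at the current iterate and solves the remaining convex subproblem in closed form.

```python
import math


def f(x):
    """Toy nonconvex objective f = g - h with g(x) = x**4/4 and h(x) = x**2/2."""
    return x**4 / 4 - x**2 / 2


def dca_quartic(x0, iters=50):
    """Classical (deterministic) DC iteration on the toy objective f.

    Illustrative only: this is not the stochastic variant from the paper.
    """
    x = x0
    history = [x]
    for _ in range(iters):
        s = x  # gradient of h(x) = x**2/2 at the current iterate
        # Convex subproblem: minimize g(y) - s*y over y.
        # Setting the derivative y**3 - s to zero gives y = cbrt(s).
        x = math.copysign(abs(s) ** (1.0 / 3.0), s)
        history.append(x)
    return x, history


if __name__ == "__main__":
    x_final, history = dca_quartic(0.1)
    print(x_final, f(x_final))
```

Starting from a nonzero point, the iterates move toward the stationary point with the same sign as the start (x = 1 or x = -1 for this toy objective), and the objective value does not increase from one step to the next.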

Cite

Text

Nitanda and Suzuki. "Stochastic Difference of Convex Algorithm and Its Application to Training Deep Boltzmann Machines." International Conference on Artificial Intelligence and Statistics, 2017.

Markdown

[Nitanda and Suzuki. "Stochastic Difference of Convex Algorithm and Its Application to Training Deep Boltzmann Machines." International Conference on Artificial Intelligence and Statistics, 2017.](https://mlanthology.org/aistats/2017/nitanda2017aistats-stochastic/)

BibTeX

@inproceedings{nitanda2017aistats-stochastic,
  title     = {{Stochastic Difference of Convex Algorithm and Its Application to Training Deep Boltzmann Machines}},
  author    = {Nitanda, Atsushi and Suzuki, Taiji},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2017},
  pages     = {470-478},
  url       = {https://mlanthology.org/aistats/2017/nitanda2017aistats-stochastic/}
}