Deep Learning via Hessian-Free Optimization

Martens, James

ICML 2010 pp. 735-742

Abstract

We develop a 2nd-order optimization method based on the "Hessian-free" approach, and apply it to training deep auto-encoders. Without using pre-training, we obtain results superior to those reported by Hinton & Salakhutdinov (2006) on the same tasks they considered. Our method is practical, easy to use, scales nicely to very large datasets, and isn't limited in applicability to auto-encoders, or any specific model class. We also discuss the issue of "pathological curvature" as a possible explanation for the difficulty of deep-learning and how 2nd-order optimization, and our method in particular, effectively deals with it.
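
To illustrate what "Hessian-free" usually means, the sketch below takes one Newton-style step without ever forming the Hessian: conjugate gradient solves for the step using only Hessian-vector products, which are approximated here by finite differences of the gradient. This is a generic truncated-Newton illustration, not the algorithm from the paper, whose details are not given in this abstract. The toy quadratic objective, the function names, and the `eps` and `tol` values are all assumptions chosen for the example.

```python
def grad(x):
    # Gradient of a toy quadratic f(x) = 0.5 x^T A x - b^T x.
    # A and b are illustrative values, not taken from the paper.
    a = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    return [sum(a[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def hessian_vector_product(grad_fn, x, v, eps=1e-6):
    # Finite-difference approximation: H v ~ (grad(x + eps v) - grad(x)) / eps.
    # The Hessian matrix itself is never built.
    g0 = grad_fn(x)
    g1 = grad_fn([xi + eps * vi for xi, vi in zip(x, v)])
    return [(g1i - g0i) / eps for g0i, g1i in zip(g0, g1)]


def conjugate_gradient(hvp, rhs, max_iters=50, tol=1e-10):
    # Solve H d = rhs using only products H v (H assumed positive definite).
    d = [0.0] * len(rhs)
    r = list(rhs)          # residual rhs - H d, with d = 0 initially
    p = list(r)            # current search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if rs_old < tol:   # squared residual norm is small enough
            break
        hp = hvp(p)
        alpha = rs_old / dot(p, hp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return d


def newton_step(grad_fn, x):
    # One step x + d, where d approximately solves H d = -grad(x).
    g = grad_fn(x)
    step = conjugate_gradient(
        lambda v: hessian_vector_product(grad_fn, x, v),
        [-gi for gi in g],
    )
    return [xi + si for xi, si in zip(x, step)]


x = newton_step(grad, [0.0, 0.0])
```

Because the toy objective is quadratic, a single step from the origin lands close to the exact minimizer of this example, (1/11, 7/11). On a non-quadratic objective such as a neural network loss, the step would be repeated, and a practical method would typically add safeguards such as damping.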

Cite

Text

Martens. "Deep Learning via Hessian-Free Optimization." International Conference on Machine Learning, 2010.

Markdown

[Martens. "Deep Learning via Hessian-Free Optimization." International Conference on Machine Learning, 2010.](https://mlanthology.org/icml/2010/martens2010icml-deep/)

BibTeX

@inproceedings{martens2010icml-deep,
  title     = {{Deep Learning via Hessian-Free Optimization}},
  author    = {Martens, James},
  booktitle = {International Conference on Machine Learning},
  year      = {2010},
  pages     = {735-742},
  url       = {https://mlanthology.org/icml/2010/martens2010icml-deep/}
}