Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent
Abstract
In this work, we generalize and unify two recent, quite different works of Jascha [10] and Lee [2] by proposing the proximal stochastic Newton-type gradient (PROXTONE) method for optimizing the sum of two convex functions: one is the average of a huge number of smooth convex functions, and the other is a nonsmooth convex function. PROXTONE incorporates second-order information to obtain stronger convergence results, in that it achieves a linear convergence rate not only in the value of the objective function, but also for the solution. The proofs are simple and intuitive, and the results and techniques can serve as a starting point for research on proximal stochastic methods that employ second-order information. The methods and principles proposed in this paper can be applied to logistic regression, the training of deep neural networks, and so on. Our numerical experiments show that PROXTONE achieves better computational performance than existing methods.
Cite
Text
Shi and Liu. "Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2015. doi:10.1007/978-3-319-23528-8_43
Markdown
[Shi and Liu. "Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2015.](https://mlanthology.org/ecmlpkdd/2015/shi2015ecmlpkdd-large/) doi:10.1007/978-3-319-23528-8_43
BibTeX
@inproceedings{shi2015ecmlpkdd-large,
title = {{Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent}},
author = {Shi, Ziqiang and Liu, Rujie},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2015},
pages = {691-704},
doi = {10.1007/978-3-319-23528-8_43},
url = {https://mlanthology.org/ecmlpkdd/2015/shi2015ecmlpkdd-large/}
}