Variance-Reduced and Projection-Free Stochastic Optimization

Abstract

The Frank-Wolfe optimization algorithm has recently regained popularity for machine learning applications due to its projection-free property and its ability to handle structured constraints. However, in the stochastic learning setting, it is still relatively understudied compared to its gradient-descent counterpart. In this work, leveraging a recent variance reduction technique, we propose two stochastic Frank-Wolfe variants that substantially improve previous results in terms of the number of stochastic gradient evaluations needed to achieve 1−ε accuracy. For example, we improve from O(1/ε) to O(ln(1/ε)) if the objective function is smooth and strongly convex, and from O(1/ε²) to O(1/ε^1.5) if the objective function is smooth and Lipschitz. The theoretical improvement is also observed in experiments on real-world datasets for a multiclass classification application.
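
To make the combination the abstract describes concrete, the Python sketch below pairs an SVRG-style variance-reduced gradient estimator with a Frank-Wolfe update that calls a linear minimization oracle instead of a projection. It is a minimal illustration, not the paper's exact SVRF algorithm: the toy least-squares problem, the ℓ1-ball constraint, the batch size, and the epoch/step-size schedule are all assumptions chosen for brevity.

import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative assumption): f(x) = (1/2n) ||Ax - b||^2
# minimized over the l1 ball {x : ||x||_1 <= r}.
n, d, r = 200, 50, 5.0
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:5] = 1.0
b = A @ x_true + 0.1 * rng.standard_normal(n)

def grad_batch(x, idx):
    """Mini-batch stochastic gradient of f at x over sample indices idx."""
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def full_grad(x):
    """Exact gradient of f at x, computed once per snapshot."""
    return A.T @ (A @ x - b) / n

def lmo_l1(g, radius):
    """Linear minimization oracle over the l1 ball:
    argmin_{||v||_1 <= radius} <g, v> is a signed vertex."""
    i = np.argmax(np.abs(g))
    v = np.zeros_like(g)
    v[i] = -radius * np.sign(g[i])
    return v

x = np.zeros(d)
for epoch in range(20):
    snapshot = x.copy()
    mu = full_grad(snapshot)  # full gradient anchored at the snapshot
    for t in range(1, 21):
        idx = rng.integers(0, n, size=10)
        # SVRG-style variance-reduced gradient estimator
        g = grad_batch(x, idx) - grad_batch(snapshot, idx) + mu
        v = lmo_l1(g, r)          # projection-free: one linear minimization
        gamma = 2.0 / (t + 1)     # classical Frank-Wolfe step size
        x = (1 - gamma) * x + gamma * v

The two ingredients are visible in the inner loop: the estimator g has variance that shrinks as x approaches the snapshot, and each iteration costs only one linear minimization over the constraint set, which is why no projection step appears anywhere.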

Cite

Text

Hazan and Luo. "Variance-Reduced and Projection-Free Stochastic Optimization." International Conference on Machine Learning, 2016.

Markdown

[Hazan and Luo. "Variance-Reduced and Projection-Free Stochastic Optimization." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/hazan2016icml-variancereduced/)

BibTeX

@inproceedings{hazan2016icml-variancereduced,
  title     = {{Variance-Reduced and Projection-Free Stochastic Optimization}},
  author    = {Hazan, Elad and Luo, Haipeng},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {1263--1271},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/hazan2016icml-variancereduced/}
}