Non-Convex Conditional Gradient Sliding
Abstract
We investigate projection-free optimization methods, namely non-convex conditional gradient sliding (NCGS), for non-convex optimization problems in the batch, stochastic, and finite-sum settings. The conditional gradient sliding (CGS) method, which integrates Nesterov's accelerated gradient method with the Frank-Wolfe (FW) method in a clever way, outperforms FW for convex optimization by reducing the number of gradient computations. However, the study of CGS in the non-convex setting is limited. In this paper, we propose non-convex conditional gradient sliding (NCGS) methods and analyze their convergence properties. We also leverage the idea of variance reduction from recent progress in convex optimization to obtain a new algorithm, termed variance-reduced NCGS (NCGS-VR), which achieves a faster convergence rate than batch NCGS in the finite-sum setting. We show that the NCGS algorithms outperform their Frank-Wolfe counterparts both in theory and in practice in all three settings, namely the batch, stochastic, and finite-sum settings. This significantly improves our understanding of optimizing non-convex functions over complicated feasible sets (where projection is prohibitively expensive).
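To make the contrast concrete, the toy sketch below illustrates the projection-free idea the abstract refers to: plain Frank-Wolfe calls a gradient and a linear minimization oracle (LMO) once per iteration, whereas the "sliding" idea reuses a single gradient across an inner loop of LMO-based steps. This is a minimal illustration on an assumed l1-ball constraint with a simple smooth objective, not the paper's actual NCGS/NCGS-VR algorithms; the helper names, step sizes, and parameters are illustrative choices.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over the l1 ball:
    argmin_{||s||_1 <= radius} <grad, s> (a signed vertex)."""
    s = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe(grad_fn, x0, steps=100, radius=1.0):
    """Plain Frank-Wolfe: one gradient evaluation and one LMO call per step."""
    x = x0.copy()
    for t in range(steps):
        g = grad_fn(x)
        s = lmo_l1_ball(g, radius)
        gamma = 2.0 / (t + 2)  # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

def sliding_sketch(grad_fn, x0, outer_steps=20, inner_steps=10,
                   radius=1.0, eta=0.1):
    """Sketch of the sliding idea (not the paper's NCGS): reuse one gradient
    across an inner loop of projection-free steps that approximately solve a
    quadratic model, decoupling gradient evaluations from LMO calls."""
    x = x0.copy()
    for _ in range(outer_steps):
        g = grad_fn(x)  # single gradient for the whole inner loop
        y = x.copy()
        for k in range(inner_steps):
            # gradient of the model <g, y> + ||y - x||^2 / (2 * eta)
            model_grad = g + (y - x) / eta
            s = lmo_l1_ball(model_grad, radius)
            gamma = 2.0 / (k + 2)
            y = (1 - gamma) * y + gamma * s
        x = y
    return x

if __name__ == "__main__":
    # toy smooth objective (convex here for simplicity): ||Ax - b||^2 / 2
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
    grad_fn = lambda x: A.T @ (A @ x - b)
    x_fw = frank_wolfe(grad_fn, np.zeros(20))
    x_sl = sliding_sketch(grad_fn, np.zeros(20))
    print(np.linalg.norm(A @ x_fw - b), np.linalg.norm(A @ x_sl - b))
```

In this sketch both methods make the same number of LMO calls, but the sliding variant evaluates the gradient only once per outer iteration, which is the source of the gradient-computation savings the abstract describes.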
Cite
Text
Qu et al. "Non-Convex Conditional Gradient Sliding." International Conference on Machine Learning, 2018.

Markdown
[Qu et al. "Non-Convex Conditional Gradient Sliding." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/qu2018icml-nonconvex/)

BibTeX
@inproceedings{qu2018icml-nonconvex,
title = {{Non-Convex Conditional Gradient Sliding}},
author = {Qu, Chao and Li, Yan and Xu, Huan},
booktitle = {International Conference on Machine Learning},
year = {2018},
  pages = {4208--4217},
volume = {80},
url = {https://mlanthology.org/icml/2018/qu2018icml-nonconvex/}
}