Orthant Based Proximal Stochastic Gradient Method for ℓ1-Regularized Optimization

Tianyi Chen, Tianyu Ding, Bo Ji, Guanyi Wang, Yixin Shi, Jing Tian, Sheng Yi, Xiao Tu, Zhihui Zhu

ECML-PKDD 2020 pp. 57-73

doi:10.1007/978-3-030-67664-3_4

Abstract

Sparsity-inducing regularization problems are ubiquitous in machine learning applications, ranging from feature selection to model compression. In this paper, we present a novel stochastic method, the Orthant Based Proximal Stochastic Gradient Method (OBProx-SG), to solve perhaps the most popular instance, i.e., the ℓ1-regularized problem. The OBProx-SG method contains two steps: (i) a proximal stochastic gradient step to predict a support cover of the solution; and (ii) an orthant step to aggressively enhance the sparsity level via orthant face projection. Compared to state-of-the-art methods, e.g., Prox-SG, RDA and Prox-SVRG, OBProx-SG not only converges to globally optimal solutions (in the convex scenario) or stationary points (in the non-convex scenario), but also substantially promotes the sparsity of the solutions. In particular, on a large number of convex problems, OBProx-SG outperforms the existing methods comprehensively in terms of sparsity exploration and objective values. Moreover, experiments on non-convex deep neural networks, e.g., MobileNetV1 and ResNet18, further demonstrate its superiority by achieving solutions of much higher sparsity without sacrificing generalization accuracy.
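
The two steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed step size `lr`, the regularization weight `lam`, and the exact form of the orthant step (a gradient step on the current support followed by zeroing any coordinate whose sign flips) are assumptions made for illustration, based only on the abstract's description of "orthant face projection".

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_sg_step(x, grad, lr, lam):
    """Proximal stochastic gradient step: gradient step, then soft-threshold.

    `grad` is assumed to be a stochastic gradient of the smooth loss at x.
    """
    return soft_threshold(x - lr * grad, lr * lam)

def orthant_step(x, grad, lr, lam):
    """Illustrative orthant step (an assumed form, not taken from the paper).

    Coordinates that are currently zero stay zero. On the support, the
    l1 term is locally linear, so its gradient is lam * sign(x). After the
    gradient step, any coordinate that left the orthant of x is set to zero.
    """
    sign = np.sign(x)
    support = sign != 0
    trial = np.zeros_like(x)
    trial[support] = x[support] - lr * (grad[support] + lam * sign[support])
    # Projection onto the orthant face: zero out coordinates whose sign changed.
    trial[np.sign(trial) != sign] = 0.0
    return trial

x = np.array([1.0, -0.05, 0.0, 2.0])
g = np.array([0.5, -1.0, 3.0, -1.0])
x_prox = prox_sg_step(x, g, lr=0.1, lam=1.0)
x_orth = orthant_step(x, g, lr=0.1, lam=1.0)
```

In this toy input, the second coordinate would cross zero during the orthant step and is therefore projected to exactly zero, while the third coordinate, already zero, is never reactivated by the orthant step. How the two steps are scheduled against each other is not stated in the abstract and is left out of the sketch.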

Cite

Text

Chen et al. "Orthant Based Proximal Stochastic Gradient Method for ℓ1-Regularized Optimization." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020. doi:10.1007/978-3-030-67664-3_4

Markdown

[Chen et al. "Orthant Based Proximal Stochastic Gradient Method for ℓ1-Regularized Optimization." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020.](https://mlanthology.org/ecmlpkdd/2020/chen2020ecmlpkdd-orthant/) doi:10.1007/978-3-030-67664-3_4

BibTeX

@inproceedings{chen2020ecmlpkdd-orthant,
  title     = {{Orthant Based Proximal Stochastic Gradient Method for ℓ1-Regularized Optimization}},
  author    = {Chen, Tianyi and Ding, Tianyu and Ji, Bo and Wang, Guanyi and Shi, Yixin and Tian, Jing and Yi, Sheng and Tu, Xiao and Zhu, Zhihui},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2020},
  pages     = {57-73},
  doi       = {10.1007/978-3-030-67664-3_4},
  url       = {https://mlanthology.org/ecmlpkdd/2020/chen2020ecmlpkdd-orthant/}
}