Efficient and Numerically Stable Sparse Learning
Abstract
We consider the problems of numerical stability and model density growth when training a sparse linear model from massive data. We focus on scalable algorithms that optimize a loss function using gradient descent, with either ℓ_0 or ℓ_1 regularization. We observed numerical stability problems in several existing methods, leading to divergence and low accuracy. In addition, these methods typically have weak control over sparsity, so that model density grows faster than necessary. We propose a framework to address these problems. First, the update rule is numerically stable, has a convergence guarantee, and results in more reasonable models. Second, besides ℓ_1 regularization, it exploits the sparsity of the data distribution and achieves a higher degree of sparsity with a PAC generalization error bound. Lastly, it is parallelizable and suitable for training large margin classifiers on huge datasets. Experiments show that the proposed method converges consistently and outperforms other baselines using 10% of features, with as much as a 6% reduction in error rate on average. Datasets and software are available from the authors.
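For readers unfamiliar with the setting, the sketch below illustrates the general idea of gradient-based training of a sparse large margin classifier with ℓ_1 regularization. It is not the update rule proposed in the paper, which the abstract does not specify. It is a generic stochastic subgradient step on the hinge loss followed by soft-thresholding, and the names `soft_threshold`, `train_sparse_linear`, `density`, and all parameter values are assumptions chosen for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_sparse_linear(examples, labels, l1_strength=0.01, step_size=0.1, epochs=20):
    """Generic illustration (hypothetical helper, not the paper's algorithm):
    a hinge-loss subgradient step per example, followed by an l1 shrinkage step."""
    num_features = len(examples[0])
    weights = [0.0] * num_features
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            margin = y * sum(w * xi for w, xi in zip(weights, x))
            if margin < 1.0:
                # The hinge-loss subgradient is -y * x when the margin is violated.
                weights = [w + step_size * y * xi for w, xi in zip(weights, x)]
            # The shrinkage step is what keeps unused weights at exactly zero.
            weights = [soft_threshold(w, step_size * l1_strength) for w in weights]
    return weights


def density(weights):
    """Fraction of nonzero weights, i.e. the model density discussed in the abstract."""
    return sum(1 for w in weights if w != 0.0) / len(weights)


# Toy data (made up): only the first feature carries signal.
toy_examples = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [-2.0, 0.0, 0.0]]
toy_labels = [1, -1, 1, -1]
toy_weights = train_sparse_linear(toy_examples, toy_labels)
print("weights:", toy_weights, "density:", density(toy_weights))
```

In this toy setup the shrinkage step leaves the weights of the two uninformative features at exactly zero, which is the kind of density control the abstract refers to. A plain subgradient method without shrinkage would not produce exact zeros in general.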
Cite
Text
Xie et al. "Efficient and Numerically Stable Sparse Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. doi:10.1007/978-3-642-15939-8_31
Markdown
[Xie et al. "Efficient and Numerically Stable Sparse Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010.](https://mlanthology.org/ecmlpkdd/2010/xie2010ecmlpkdd-efficient/) doi:10.1007/978-3-642-15939-8_31
BibTeX
@inproceedings{xie2010ecmlpkdd-efficient,
title = {{Efficient and Numerically Stable Sparse Learning}},
author = {Xie, Sihong and Fan, Wei and Verscheure, Olivier and Ren, Jiangtao},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2010},
pages = {483-498},
doi = {10.1007/978-3-642-15939-8_31},
url = {https://mlanthology.org/ecmlpkdd/2010/xie2010ecmlpkdd-efficient/}
}