Fast Margin Maximization via Dual Acceleration
Abstract
We present and analyze a momentum-based gradient method for training linear classifiers with an exponentially-tailed loss (e.g., the exponential or logistic loss), which maximizes the classification margin on separable data at a rate of O(1/t^2). This contrasts with a rate of O(1/log(t)) for standard gradient descent, and O(1/t) for normalized gradient descent. The momentum-based method is derived via the convex dual of the maximum-margin problem, specifically by applying Nesterov acceleration to this dual, which yields a simple and intuitive method in the primal. This dual view can also be used to derive a stochastic variant, which performs adaptive non-uniform sampling via the dual variables.
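To make the setup concrete, below is a minimal NumPy sketch of the general flavor of such a primal method: a Nesterov-style momentum step applied to the exponential-loss gradient, with the step normalized by the current loss value so that it stays well scaled as the loss decays. The toy dataset, the step size `eta`, the momentum schedule `(t-1)/(t+2)`, and the loss normalization are all illustrative assumptions, not the paper's exact algorithm; the script simply tracks the normalized margin min_i y_i⟨w, x_i⟩/‖w‖ as training proceeds.

```python
# Illustrative sketch (not the paper's exact update): Nesterov-style momentum on
# the exponential loss for a toy linearly separable problem, tracking how the
# normalized margin min_i y_i <w, x_i> / ||w|| grows over iterations.
import numpy as np

rng = np.random.default_rng(0)

# Toy separable data: labels are determined by a ground-truth direction w_star.
n, d = 200, 5
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

def exp_loss_grad(w):
    """Gradient and value of the average exponential loss (1/n) sum_i exp(-y_i <w, x_i>)."""
    margins = y * (X @ w)
    coeffs = np.exp(-margins)            # per-example weights, reminiscent of dual variables
    grad = -(X.T @ (y * coeffs)) / n
    return grad, coeffs.mean()

def normalized_margin(w):
    nrm = np.linalg.norm(w)
    return np.min(y * (X @ w)) / nrm if nrm > 0 else 0.0

w = np.zeros(d)
w_prev = np.zeros(d)
eta = 1.0                                 # illustrative step size
for t in range(1, 1001):
    # Nesterov-style look-ahead point with momentum weight (t-1)/(t+2).
    beta = (t - 1) / (t + 2)
    z = w + beta * (w - w_prev)
    grad, loss = exp_loss_grad(z)
    # Dividing the gradient by the loss keeps the step a bounded, meaningful size
    # even as the exponential loss decays (a common trick for such losses).
    w_prev, w = w, z - eta * grad / max(loss, 1e-12)

print("normalized margin after training:", normalized_margin(w))
```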
Cite
Text
Ji et al. "Fast Margin Maximization via Dual Acceleration." International Conference on Machine Learning, 2021.
Markdown
[Ji et al. "Fast Margin Maximization via Dual Acceleration." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/ji2021icml-fast/)
BibTeX
@inproceedings{ji2021icml-fast,
title = {{Fast Margin Maximization via Dual Acceleration}},
author = {Ji, Ziwei and Srebro, Nathan and Telgarsky, Matus},
booktitle = {International Conference on Machine Learning},
year = {2021},
pages = {4860--4869},
volume = {139},
url = {https://mlanthology.org/icml/2021/ji2021icml-fast/}
}