A Light Touch for Heavily Constrained SGD

Abstract

Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, or classifiers that guarantee a certain class label for some subset of examples. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but it requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while applying constraints only probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.
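To give a concrete feel for the general idea, the sketch below is a minimal illustration (not the authors' algorithm) of SGD that, instead of a full projection onto all constraints, samples a few constraints at random each iteration and projects onto the most violated one. The least-squares loss, the linear constraints a_i . w <= b_i, and all names (light_touch_sgd, project_halfspace, the parameter k) are illustrative assumptions, not taken from the paper.

import numpy as np

# Hypothetical setup: least-squares loss with many linear constraints a_i . w <= b_i.
rng = np.random.default_rng(0)
d, n, m = 10, 200, 5000                      # dimension, examples, number of constraints
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
A = rng.normal(size=(m, d))
b = np.abs(rng.normal(size=m))               # w = 0 is feasible

def project_halfspace(w, a, bi):
    """Euclidean projection of w onto the halfspace {v : a.v <= bi}."""
    viol = a @ w - bi
    return w - (viol / (a @ a)) * a if viol > 0 else w

def light_touch_sgd(steps=2000, lr=0.01, k=5):
    """SGD that checks only k randomly sampled constraints per iteration
    and projects onto the most violated of them (a 'light touch' stand-in
    for a full projection onto all m constraints)."""
    w = np.zeros(d)
    for t in range(steps):
        i = rng.integers(n)                  # stochastic gradient of 0.5 * (x.w - y)^2
        g = (X[i] @ w - y[i]) * X[i]
        w -= lr * g
        idx = rng.integers(m, size=k)        # probabilistic constraint check
        viol = A[idx] @ w - b[idx]
        j = idx[np.argmax(viol)]
        w = project_halfspace(w, A[j], b[j])
    return w

w = light_touch_sgd()
print("max constraint violation:", max(0.0, float(np.max(A @ w - b))))

The per-iteration cost here scales with k rather than with the total number of constraints m; the price is that the iterate is only kept near, not exactly inside, the feasible region, which mirrors the trade-off analyzed in the paper.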

Cite

Text

Cotter et al. "A Light Touch for Heavily Constrained SGD." Annual Conference on Computational Learning Theory, 2016.

Markdown

[Cotter et al. "A Light Touch for Heavily Constrained SGD." Annual Conference on Computational Learning Theory, 2016.](https://mlanthology.org/colt/2016/cotter2016colt-light/)

BibTeX

@inproceedings{cotter2016colt-light,
  title     = {{A Light Touch for Heavily Constrained SGD}},
  author    = {Cotter, Andrew and Gupta, Maya R. and Pfeifer, Jan},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2016},
  pages     = {729--771},
  url       = {https://mlanthology.org/colt/2016/cotter2016colt-light/}
}