Maximum Entropy Distribution Estimation with Generalized Regularization

Abstract

We present a unified and complete account of maximum entropy distribution estimation subject to constraints represented by convex potential functions or, alternatively, by convex regularization. We provide fully general performance guarantees and an algorithm with a complete convergence proof. As special cases, we can easily derive performance guarantees for many known regularization types, including $\ell_1$, $\ell_2$, $\ell_2^2$, and $\ell_1 + \ell_2^2$ style regularization. Furthermore, our general approach enables us to use information about the structure of the feature space or about sample selection bias to derive entirely new regularization functions with superior guarantees. We propose an algorithm solving a large and general subclass of generalized maxent problems, including all those discussed in the paper, and prove its convergence. Our approach generalizes techniques based on information geometry and Bregman divergences as well as those based more directly on compactness.
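In this framework, maxent with constraints relaxed to a box, $|\mathrm{E}_p[f_j] - \tilde{\mathrm{E}}[f_j]| \le \beta_j$, is dual to minimizing $\ell_1$-regularized log loss over Gibbs distributions $p_\lambda(x) \propto \exp(\lambda \cdot f(x))$. As a rough illustration only, not the authors' own sequential-update algorithm, the sketch below solves this $\ell_1$-regularized dual on a finite domain by proximal gradient descent (soft-thresholding); the names l1_maxent, F, samples, beta, and step are hypothetical.

import numpy as np

def l1_maxent(F, samples, beta=0.1, step=0.05, iters=2000):
    """Illustrative l1-regularized maxent (not the paper's algorithm).
    F: (N, m) feature matrix over the N domain points;
    samples: indices of observed points; beta: l1 regularization weight."""
    N, m = F.shape
    emp = F[samples].mean(axis=0)           # empirical feature expectations
    lam = np.zeros(m)
    for _ in range(iters):
        # Gibbs distribution p_lam(x) proportional to exp(lam . f(x))
        logits = F @ lam
        logits -= logits.max()              # for numerical stability
        p = np.exp(logits)
        p /= p.sum()
        # gradient of the smooth part: log Z(lam) - lam . emp
        grad = F.T @ p - emp
        lam -= step * grad
        # proximal step for beta * ||lam||_1: soft-thresholding
        lam = np.sign(lam) * np.maximum(np.abs(lam) - step * beta, 0.0)
    return lam

# toy usage on a random finite domain
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 5))
samples = rng.integers(0, 50, size=30)
print(l1_maxent(F, samples))

The soft-thresholding step drives many coordinates of $\lambda$ exactly to zero, mirroring the sparsity effect that motivates $\ell_1$-style regularization in the paper.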

Cite

Text

Dudík and Schapire. "Maximum Entropy Distribution Estimation with Generalized Regularization." Annual Conference on Computational Learning Theory, 2006. doi:10.1007/11776420_12

Markdown

[Dudík and Schapire. "Maximum Entropy Distribution Estimation with Generalized Regularization." Annual Conference on Computational Learning Theory, 2006.](https://mlanthology.org/colt/2006/dudik2006colt-maximum/) doi:10.1007/11776420_12

BibTeX

@inproceedings{dudik2006colt-maximum,
  title     = {{Maximum Entropy Distribution Estimation with Generalized Regularization}},
  author    = {Dudík, Miroslav and Schapire, Robert E.},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2006},
  pages     = {123--138},
  doi       = {10.1007/11776420_12},
  url       = {https://mlanthology.org/colt/2006/dudik2006colt-maximum/}
}