Learning Discrete Distributions with Infinite Support

Abstract

We present a novel approach to estimating discrete distributions with (potentially) infinite support in the total variation metric. In a departure from the established paradigm, we make no structural assumptions whatsoever on the sampling distribution. In such a setting, distribution-free risk bounds are impossible, and the best one can hope for is a fully empirical, data-dependent bound. We derive precisely such bounds, and demonstrate that they are, in a well-defined sense, the best possible. Our main discovery is that the half-norm of the empirical distribution provides tight upper and lower estimates on the empirical risk. Furthermore, this quantity decays at a nearly optimal rate as a function of the true distribution. The optimality follows from a minimax result, which may be of independent interest. Additional structural results are provided, including an exact Rademacher complexity calculation and an apparently novel connection between the total variation risk and the missing mass.
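For orientation, the following display recalls standard definitions of the quantities the abstract names. This is only an assumed reading of the terminology (in particular, "half-norm" is taken here to mean the 1/2-quasi-norm of a distribution); the paper's own notation and definitions may differ. Here p is the unknown sampling distribution on a countable alphabet, q is any other distribution on that alphabet, and X_1, ..., X_n are i.i.d. samples from p.

\[
\|p - q\|_{\mathrm{TV}} = \tfrac{1}{2}\sum_{j} |p(j) - q(j)|,
\qquad
\hat{p}_n(j) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{X_i = j\},
\qquad
\|p\|_{1/2} = \Big(\sum_{j} \sqrt{p(j)}\Big)^{2}.
\]

The missing mass referred to at the end of the abstract is, in its usual sense, the total probability that p assigns to symbols not observed in the sample:

\[
M_n = \sum_{j} p(j)\,\mathbf{1}\{j \notin \{X_1,\dots,X_n\}\}.
\]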

Cite

Text

Cohen et al. "Learning Discrete Distributions with Infinite Support." Neural Information Processing Systems, 2020.

Markdown

[Cohen et al. "Learning Discrete Distributions with Infinite Support." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/cohen2020neurips-learning/)

BibTeX

@inproceedings{cohen2020neurips-learning,
  title     = {{Learning Discrete Distributions with Infinite Support}},
  author    = {Cohen, Doron and Kontorovich, Aryeh and Wolfer, Geoffrey},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/cohen2020neurips-learning/}
}