Learning Exponential Families from Truncated Samples

Abstract

Missing data problems have many manifestations across many scientific fields. A fundamental type of missing data problem arises when samples are truncated, i.e., samples that lie in a subset of the support are not observed. Statistical estimation from truncated samples is a classical problem in statistics which dates back to Galton, Pearson, and Fisher. A recent line of work provides the first efficient estimation algorithms for the parameters of a Gaussian distribution and for linear regression with Gaussian noise. In this paper we generalize these results to log-concave exponential families. We provide an estimation algorithm that shows that extrapolation is possible for a much larger class of distributions while it maintains a polynomial sample and time complexity on average. Our algorithm is based on Projected Stochastic Gradient Descent and is not only applicable in a more general setting but is also simpler and more efficient than recent algorithms. Our work also has interesting implications for learning general log-concave distributions and sampling given only access to truncated data.
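To make the idea of Projected Stochastic Gradient Descent on truncated samples concrete, here is a minimal one-dimensional sketch. It is not the paper's algorithm: the objective, step sizes, initialization, and projection set used there may differ. The sketch assumes a unit-variance Gaussian with unknown mean `theta` (an exponential family with sufficient statistic T(x) = x), a known truncation set S = [0, inf) available through a membership oracle, and a hypothetical projection interval [-2, 3]. All function names and constants are illustrative. The stochastic gradient of the truncated negative log-likelihood at `theta` is `z - x`, where `x` is an observed sample and `z` is drawn from the current model conditioned on S by rejection sampling.

```python
import random


def in_truncation_set(x):
    """Membership oracle for the (assumed known) truncation set S = [0, inf)."""
    return x >= 0.0


def sample_truncated(theta, rng, max_tries=100_000):
    """Rejection-sample z ~ N(theta, 1) conditioned on z lying in S."""
    for _ in range(max_tries):
        z = rng.gauss(theta, 1.0)
        if in_truncation_set(z):
            return z
    raise RuntimeError("S has too little mass under the current parameter")


def project(theta, lo, hi):
    """Euclidean projection onto the interval [lo, hi]."""
    return min(max(theta, lo), hi)


def psgd_truncated_mean(samples, rng, lo=-2.0, hi=3.0, steps=20_000, step0=2.0):
    """Projected SGD on the truncated negative log-likelihood (illustrative)."""
    # Start from the (biased) empirical mean, projected onto the feasible set.
    theta = project(sum(samples) / len(samples), lo, hi)
    for t in range(1, steps + 1):
        x = rng.choice(samples)            # one observed truncated sample
        z = sample_truncated(theta, rng)   # one sample from the current model on S
        grad = z - x                       # unbiased gradient estimate
        theta = project(theta - (step0 / t) * grad, lo, hi)
    return theta


rng = random.Random(0)
true_theta = 0.5
data = [sample_truncated(true_theta, rng) for _ in range(5_000)]
naive = sum(data) / len(data)
estimate = psgd_truncated_mean(data, rng)
print(f"naive mean of truncated data: {naive:.3f}")
print(f"PSGD estimate of theta:       {estimate:.3f}")
```

The naive mean is biased upward because every sample below zero is missing, while the projected iterate moves back toward the untruncated mean. The projection step keeps the parameter in a region where S retains enough probability mass for rejection sampling to succeed.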

Cite

Text

Lee et al. "Learning Exponential Families from Truncated Samples." Neural Information Processing Systems, 2023.

BibTeX

@inproceedings{lee2023neurips-learning,
  title     = {{Learning Exponential Families from Truncated Samples}},
  author    = {Lee, Jane and Wibisono, Andre and Zampetakis, Emmanouil},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/lee2023neurips-learning/}
}