Learning Exponential Families from Truncated Samples
Abstract
Missing data problems arise in many forms across scientific fields. A fundamental type of missing data problem arises when samples are \textit{truncated}, i.e., samples that lie in a subset of the support are not observed. Statistical estimation from truncated samples is a classical problem in statistics that dates back to Galton, Pearson, and Fisher. A recent line of work provides the first efficient estimation algorithms for the parameters of a Gaussian distribution and for linear regression with Gaussian noise. In this paper we generalize these results to log-concave exponential families. We provide an estimation algorithm showing that \textit{extrapolation} is possible for a much larger class of distributions while maintaining polynomial sample and time complexity on average. Our algorithm is based on Projected Stochastic Gradient Descent and is not only applicable in a more general setting but is also simpler and more efficient than recent algorithms. Our work also has interesting implications for learning general log-concave distributions and for sampling given only access to truncated data.
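To make the idea of Projected Stochastic Gradient Descent on truncated data concrete, here is a minimal sketch for the simplest case: a one-dimensional, unit-variance Gaussian whose samples are observed only when they fall in S = [0, ∞). It is not the authors' algorithm. The truncation set, the membership oracle `in_truncation_set`, the projection radius, and the step-size schedule are all assumptions chosen for illustration. The sketch relies on a standard fact about exponential families: the gradient of the truncated negative log-likelihood at parameter θ is the expected sufficient statistic under the truncated model minus the observed sufficient statistic, so a single rejection-sampled draw gives an unbiased stochastic gradient.

```python
import random


def in_truncation_set(x, lower=0.0):
    # Hypothetical membership oracle for the truncation set S = [lower, inf).
    return x >= lower


def sample_truncated(mean, rng, max_tries=10_000):
    # Rejection sampling: draw from N(mean, 1) until the draw lands in S.
    for _ in range(max_tries):
        y = rng.gauss(mean, 1.0)
        if in_truncation_set(y):
            return y
    raise RuntimeError("rejection sampling failed; S has too little mass")


def project(theta, center, radius):
    # Euclidean projection onto the interval [center - radius, center + radius].
    return min(max(theta, center - radius), center + radius)


def psgd_truncated_mean(samples, radius=2.0, epochs=5, seed=0):
    # Estimate the mean of the untruncated N(mean, 1) from truncated samples.
    rng = random.Random(seed)
    center = sum(samples) / len(samples)  # naive (biased) mean of the data
    theta = center                        # start from the naive estimate
    t = 0
    for _ in range(epochs):
        for x in samples:
            t += 1
            y = sample_truncated(theta, rng)
            # Unbiased gradient of the truncated negative log-likelihood:
            # (sufficient statistic under the model) - (observed statistic).
            grad = y - x
            step_size = 2.0 / (t + 10)    # assumed decaying schedule
            theta = project(theta - step_size * grad, center, radius)
    return theta
```

Calling `psgd_truncated_mean(samples)` on a list of truncated observations returns the last iterate. The projection keeps θ in a region where the truncation set retains non-negligible mass, so rejection sampling stays cheap. That is the role the projection step plays in this sketch, and the radius here is a hand-picked value rather than one derived from theory.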
Cite
Text
Lee et al. "Learning Exponential Families from Truncated Samples." Neural Information Processing Systems, 2023.
Markdown
[Lee et al. "Learning Exponential Families from Truncated Samples." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/lee2023neurips-learning/)
BibTeX
@inproceedings{lee2023neurips-learning,
title = {{Learning Exponential Families from Truncated Samples}},
author = {Lee, Jane and Wibisono, Andre and Zampetakis, Emmanouil},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/lee2023neurips-learning/}
}