Learning Exponential Families from Truncated Samples

Abstract

Missing data problems arise in many forms across the sciences. A fundamental instance occurs when samples are \textit{truncated}, i.e., samples that fall in some subset of the support are never observed. Statistical estimation from truncated samples is a classical problem in statistics, dating back to Galton, Pearson, and Fisher. A recent line of work gives the first efficient estimation algorithms for the parameters of a Gaussian distribution and for linear regression with Gaussian noise. In this paper we generalize these results to log-concave exponential families. We provide an estimation algorithm showing that \textit{extrapolation} is possible for a much larger class of distributions, while maintaining polynomial sample and time complexity. Our algorithm is based on Projected Stochastic Gradient Descent and is not only applicable in a more general setting but also simpler and more efficient than recent algorithms. Our work also has interesting implications for learning general log-concave distributions and for sampling given only access to truncated data.
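
To make the PSGD idea concrete: for an exponential family with natural parameter theta and sufficient statistic T, the gradient of the truncated negative log-likelihood at an observed sample x is the standard identity E_{z ~ p_theta restricted to S}[T(z)] - T(x), which can be estimated by rejection sampling from the untruncated model. The following is a minimal illustrative sketch, not the paper's exact algorithm: it assumes a one-dimensional Gaussian with unit variance (so T(x) = x), a membership oracle for the truncation set S, and hypothetical choices for the oracle in_S, the threshold 1.0, the projection radius, and the step size.

# Minimal sketch of PSGD for truncated MLE, specialized (as an
# illustrative assumption) to a 1-D unit-variance Gaussian with
# natural parameter theta and sufficient statistic T(x) = x.
import numpy as np

rng = np.random.default_rng(0)

def in_S(x):
    # Hypothetical truncation set: samples below 1.0 are never observed.
    return x >= 1.0

def sample_truncated_model(theta):
    # Rejection sampling from p_theta conditioned on S; feasible when
    # p_theta places non-negligible mass on S.
    while True:
        z = rng.normal(loc=theta, scale=1.0)
        if in_S(z):
            return z

# Truncated data drawn from a ground-truth theta* = 0.5 (demo only).
theta_star = 0.5
data = []
while len(data) < 2000:
    x = rng.normal(loc=theta_star, scale=1.0)
    if in_S(x):
        data.append(x)

theta, radius = 0.0, 5.0
for t, x in enumerate(data, start=1):
    z = sample_truncated_model(theta)
    # Single-sample estimate of the truncated NLL gradient:
    # E_{z ~ p_theta|S}[T(z)] - T(x).
    grad = z - x
    theta -= (0.1 / np.sqrt(t)) * grad
    theta = np.clip(theta, -radius, radius)  # projection onto a ball

print(f"estimated theta: {theta:.3f} (true: {theta_star})")

Each iteration touches one truncated sample and one rejection-sampled model draw, which is what keeps the method simple relative to earlier truncated-Gaussian estimators; the projection step corresponds to the boundedness assumption on the parameter space.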

Cite

Text

Lee et al. "Learning Exponential Families from Truncated Samples." ICML 2023 Workshops: AdvML-Frontiers, 2023.

Markdown

[Lee et al. "Learning Exponential Families from Truncated Samples." ICML 2023 Workshops: AdvML-Frontiers, 2023.](https://mlanthology.org/icmlw/2023/lee2023icmlw-learning/)

BibTeX

@inproceedings{lee2023icmlw-learning,
  title     = {{Learning Exponential Families from Truncated Samples}},
  author    = {Lee, Jane H. and Wibisono, Andre and Zampetakis, Manolis},
  booktitle = {ICML 2023 Workshops: AdvML-Frontiers},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/lee2023icmlw-learning/}
}