Distribution Learning with Valid Outputs Beyond the Worst-Case
Abstract
Generative models at times produce "invalid" outputs, such as images with generation artifacts and unnatural sounds. Validity-constrained distribution learning attempts to address this problem by requiring that the learned distribution have a provably small fraction of its mass in invalid parts of space, something that standard loss minimization does not always ensure. To this end, a learner in this model can guide the learning via "validity queries", which allow it to ascertain the validity of individual examples. Prior work on this problem takes a worst-case stance, showing that proper learning requires an exponential number of validity queries, and demonstrating an improper algorithm which, while providing guarantees in a wide range of settings, makes a relatively large polynomial number of validity queries. In this work, we take a first step towards characterizing regimes where guaranteeing validity is easier than in the worst case. We show that when the data distribution lies in the model class and the log-loss is minimized, the number of samples required to ensure validity has a weak dependence on the validity requirement. Additionally, we show that when the validity region belongs to a VC-class, a limited number of validity queries are often sufficient.
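To make the notions of "invalid mass" and "validity query" concrete, the following is a minimal Python sketch. It is not the algorithm from the paper. It only estimates, by Monte Carlo sampling, how much mass a fixed model places on invalid outputs, with a one-sided Hoeffding bound on the true value. The names `estimate_invalid_mass`, `sample_model`, and `is_valid` are hypothetical, and the toy model and validity region are assumptions chosen purely for illustration.

```python
import math
import random


def estimate_invalid_mass(sample_model, is_valid, n_queries, delta=0.05):
    """Monte Carlo estimate of the mass a model places on invalid outputs.

    sample_model: hypothetical zero-argument callable that draws one sample.
    is_valid: hypothetical validity oracle; each call is one validity query.
    Returns (estimate, upper_bound). By the one-sided Hoeffding inequality,
    the true invalid mass is at most upper_bound with probability >= 1 - delta.
    """
    invalid = sum(0 if is_valid(sample_model()) else 1 for _ in range(n_queries))
    estimate = invalid / n_queries
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n_queries))
    return estimate, min(1.0, estimate + slack)


# Toy setup (assumed): the model is uniform on [0, 1) and outputs below 0.9 are valid,
# so the true invalid mass is 0.1.
rng = random.Random(0)
est, upper = estimate_invalid_mass(rng.random, lambda x: x < 0.9, n_queries=2000)
print(f"estimated invalid mass: {est:.3f}, upper bound: {upper:.3f}")
```

In this sketch the slack term shrinks as `1 / sqrt(n_queries)`, so certifying a tighter validity requirement this way takes more validity queries. That query cost is the quantity the abstract is concerned with.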
Cite
Text
Rittler and Chaudhuri. "Distribution Learning with Valid Outputs Beyond the Worst-Case." Neural Information Processing Systems, 2024. doi:10.52202/079017-0719
Markdown
[Rittler and Chaudhuri. "Distribution Learning with Valid Outputs Beyond the Worst-Case." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/rittler2024neurips-distribution/) doi:10.52202/079017-0719
BibTeX
@inproceedings{rittler2024neurips-distribution,
title = {{Distribution Learning with Valid Outputs Beyond the Worst-Case}},
author = {Rittler, Nick and Chaudhuri, Kamalika},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-0719},
url = {https://mlanthology.org/neurips/2024/rittler2024neurips-distribution/}
}