Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity
Abstract
We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture. We provide a sharp end-to-end analysis of the problem. First, we provide a tight closed-form characterization of the learnt velocity field when parametrized by a shallow denoising autoencoder trained on a finite number $n$ of samples from the target distribution. Building on this analysis, we provide a sharp description of the corresponding generative flow, which pushes the base Gaussian density forward to an approximation of the target density. In particular, we provide closed-form formulae for the distance between the means of the generated mixture and the mean of the target mixture, which we show decays as $\Theta_n(\frac{1}{n})$. Finally, this rate is shown to be in fact Bayes-optimal.
Cite
Text
Cui et al. "Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity." International Conference on Learning Representations, 2024.
Markdown
[Cui et al. "Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/cui2024iclr-analysis/)
BibTeX
@inproceedings{cui2024iclr-analysis,
title = {{Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity}},
author = {Cui, Hugo and Krzakala, Florent and Vanden-Eijnden, Eric and Zdeborova, Lenka},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/cui2024iclr-analysis/}
}