Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity

Abstract

We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture, and provide a sharp end-to-end analysis of the problem. First, we derive a tight closed-form characterization of the learnt velocity field when it is parametrized by a shallow denoising autoencoder trained on a finite number $n$ of samples from the target distribution. Building on this analysis, we obtain a sharp description of the corresponding generative flow, which pushes the base Gaussian density forward to an approximation of the target density. In particular, we provide closed-form formulae for the distance between the means of the generated mixture and those of the target mixture, which we show decays as $\Theta_n(1/n)$. Finally, we show that this rate is in fact Bayes-optimal.
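
The abstract compresses a full pipeline: learn a velocity field from $n$ samples with a shallow denoising autoencoder, then integrate the resulting flow ODE from the base Gaussian to generate samples. Below is a minimal numerical sketch of that pipeline, not the paper's exact architecture, training protocol, or hyperparameters: the symmetric two-mode mixture, the linear interpolant $x_t = (1-t)z + t x_1$, the time grid, the learning rates, and the per-time-step rank-one model $b_t(x) = c_t x + u_t \tanh(w_t \cdot x/\sqrt{d})$ are all illustrative assumptions.

```python
# Minimal sketch: learn a velocity field from n mixture samples with a
# shallow DAE-style model (skip connection + rank-one nonlinear head),
# then Euler-integrate the flow ODE from the base Gaussian.
import numpy as np

rng = np.random.default_rng(0)
d, n, sigma = 50, 4000, 0.3            # dimension, sample budget, mode width
mu = np.ones(d) / np.sqrt(d)           # target mixture has modes at +/- mu

def sample_target(m):
    # Symmetric two-mode Gaussian mixture (1/2) N(+mu, s^2 I) + (1/2) N(-mu, s^2 I).
    s = rng.choice([-1.0, 1.0], size=(m, 1))
    return s * mu + sigma * rng.standard_normal((m, d))

x1 = sample_target(n)                  # the n training samples

ts = np.linspace(0.05, 0.95, 19)       # time grid for the flow
dt = ts[1] - ts[0]
lr_c, lr_uw = 0.005, 0.05              # illustrative learning rates
params = []
for t in ts:
    # One small model per grid time: b_t(x) = c*x + u*tanh(w.x / sqrt(d)).
    c, u, w = 0.0, np.zeros(d), rng.standard_normal(d) / np.sqrt(d)
    for _ in range(500):               # plain SGD on the MSE regression loss
        idx = rng.integers(0, n, 256)
        z = rng.standard_normal((256, d))
        xt = (1 - t) * z + t * x1[idx] # linear interpolant at time t
        y = x1[idx] - z                # its velocity, the regression target
        a = np.tanh(xt @ w / np.sqrt(d))[:, None]
        r = c * xt + a * u - y         # residual of the prediction
        gc = 2 * np.mean(np.sum(r * xt, axis=1))
        gu = 2 * np.mean(r * a, axis=0)
        gw = 2 * np.mean((r @ u)[:, None] * (1 - a**2) * xt, axis=0) / np.sqrt(d)
        c, u, w = c - lr_c * gc, u - lr_uw * gu, w - lr_uw * gw
    params.append((c, u, w))

# Generation: Euler-integrate dx/dt = b_t(x) starting from base Gaussian samples.
x = rng.standard_normal((2000, d))
for (c, u, w) in params:
    a = np.tanh(x @ w / np.sqrt(d))[:, None]
    x = x + dt * (c * x + a * u)

# Discrepancy between the mean of the generated positive mode and +mu.
pos = x[x @ mu > 0]
print("distance between generated and target mode means:",
      np.linalg.norm(pos.mean(axis=0) - mu))
```

Rerunning this sketch with a larger sample budget $n$ should shrink the printed discrepancy, qualitatively in line with the $\Theta_n(1/n)$ decay of the mean mismatch that the paper establishes and proves Bayes-optimal.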

Cite

Text

Cui et al. "Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity." International Conference on Learning Representations, 2024.

Markdown

[Cui et al. "Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/cui2024iclr-analysis/)

BibTeX

@inproceedings{cui2024iclr-analysis,
  title     = {{Analysis of Learning a Flow-Based Generative Model from Limited Sample Complexity}},
  author    = {Cui, Hugo and Krzakala, Florent and Vanden-Eijnden, Eric and Zdeborova, Lenka},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/cui2024iclr-analysis/}
}