On the Discrepancy and Connection Between Memorization and Generation in Diffusion Models

Abstract

Diffusion models (DMs), a state-of-the-art generative modeling method, have enjoyed tremendous success across multiple generation tasks. However, the memorization behavior of DMs, in which generated samples replicate the training data, raises serious privacy concerns and contradicts their observed generalizability. These observations prompt us to delve deeper into the generalization and memorization of DMs, particularly in cases where the closed-form solution of the DMs' score function can be obtained explicitly. Through a series of comprehensive experiments, we demonstrate the discrepancies and connections between the optimal score and the trained score, noting that the trained score is smoother, which benefits the generalizability of DMs. We further explore how mixing the optimal score with the trained score during the sampling phase affects generation. Our experimental findings provide novel insights into the generalizability of DMs.
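The closed-form optimal score referenced above is the score of the empirical data distribution under the forward noising process: a Gaussian mixture centered at the scaled training points. Below is a minimal sketch of this closed-form score and of interpolating it with a trained model's score at sampling time, assuming a VP (DDPM-style) forward process; the function names and the mixing weight `lam` are illustrative choices, not notation from the paper.

```python
import numpy as np

def optimal_score(x, t, data, alpha_bar):
    """Closed-form (empirical-optimal) score for a VP diffusion process.

    With x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps and an
    empirical data distribution, p_t is a Gaussian mixture centered at the
    scaled training points, so its score is a softmax-weighted average.

    x         : (d,) query point at timestep t
    data      : (n, d) training set
    alpha_bar : (T,) cumulative noise schedule (assumed precomputed)
    """
    a = alpha_bar[t]
    mu = np.sqrt(a) * data                     # mixture means, shape (n, d)
    var = 1.0 - a                              # shared isotropic variance
    diff = mu - x                              # (n, d)
    logits = -np.sum(diff ** 2, axis=1) / (2 * var)
    w = np.exp(logits - logits.max())
    w /= w.sum()                               # posterior weights over training points
    return (w @ diff) / var                    # grad_x log p_t(x)

def mixed_score(x, t, data, alpha_bar, trained_score, lam=0.5):
    """Convex mix of the optimal and the trained score at sampling time.

    `trained_score(x, t)` stands in for a learned score network; `lam` is an
    illustrative mixing weight, not a value from the paper.
    """
    return lam * optimal_score(x, t, data, alpha_bar) \
        + (1.0 - lam) * trained_score(x, t)
```

Running a sampler with `mixed_score` in place of the trained score alone is one way to probe the trade-off between the memorizing (optimal) and generalizing (trained) behavior described in the abstract.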

Cite

Text

Wang et al. "On the Discrepancy and Connection Between Memorization and Generation in Diffusion Models." ICML 2024 Workshops: FM-Wild, 2024.

Markdown

[Wang et al. "On the Discrepancy and Connection Between Memorization and Generation in Diffusion Models." ICML 2024 Workshops: FM-Wild, 2024.](https://mlanthology.org/icmlw/2024/wang2024icmlw-discrepancy/)

BibTeX

@inproceedings{wang2024icmlw-discrepancy,
  title     = {{On the Discrepancy and Connection Between Memorization and Generation in Diffusion Models}},
  author    = {Wang, Hanyu and Han, Yujin and Zou, Difan},
  booktitle = {ICML 2024 Workshops: FM-Wild},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/wang2024icmlw-discrepancy/}
}