The Journey, Not the Destination: How Data Guides Diffusion Models
Abstract
Diffusion-based generative models can synthesize photo-realistic images of remarkable quality and diversity. However, *attributing* these images back to the training data (that is, identifying the specific training examples that caused an image to be generated) remains a challenge. In this paper, we propose a framework that: (i) frames data attribution in the context of diffusion models, (ii) provides a method for computing such attributions efficiently, and (iii) allows us to *counterfactually* validate them. We then apply our framework to the CIFAR-10 and MS COCO datasets.
Cite
Text
Georgiev et al. "The Journey, Not the Destination: How Data Guides Diffusion Models." ICML 2023 Workshops: DeployableGenerativeAI, 2023.
Markdown
[Georgiev et al. "The Journey, Not the Destination: How Data Guides Diffusion Models." ICML 2023 Workshops: DeployableGenerativeAI, 2023.](https://mlanthology.org/icmlw/2023/georgiev2023icmlw-journey/)
BibTeX
@inproceedings{georgiev2023icmlw-journey,
title = {{The Journey, Not the Destination: How Data Guides Diffusion Models}},
author = {Georgiev, Kristian and Vendrow, Joshua and Salman, Hadi and Park, Sung Min and Madry, Aleksander},
booktitle = {ICML 2023 Workshops: DeployableGenerativeAI},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/georgiev2023icmlw-journey/}
}