One-Step Diffusion with Distribution Matching Distillation
Abstract
Diffusion models generate high-quality images but require dozens of forward passes. We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator with minimal impact on image quality. We enforce the one-step image generator to match the diffusion model at the distribution level by minimizing an approximate KL divergence whose gradient can be expressed as the difference between two score functions: one of the target distribution and the other of the synthetic distribution produced by our one-step generator. The score functions are parameterized as two diffusion models trained separately on each distribution. Combined with a simple regression loss matching the large-scale structure of the multi-step diffusion outputs, our method outperforms all published few-step diffusion approaches, reaching 2.62 FID on ImageNet 64x64 and 11.49 FID on zero-shot COCO-30k, comparable to Stable Diffusion but orders of magnitude faster. Utilizing FP16 inference, our model can generate images at 20 FPS on modern hardware.
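The core idea in the abstract, that the gradient of the approximate KL divergence is the difference between two score functions, can be illustrated with a toy sketch. The example below is not the paper's implementation: it replaces the two diffusion-model score estimators with analytic scores of 1-D Gaussians (`gaussian_score` and the means `mu_real`, `mu_fake` are illustrative stand-ins), and shows that descending the score difference pulls generator samples toward the target distribution.

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    # Score function (gradient of the log-density) of N(mu, sigma^2).
    return -(x - mu) / sigma**2

# Toy stand-ins for the two score models: s_real scores the target ("real")
# distribution, s_fake the generator's current ("fake") output distribution.
# In DMD these are diffusion models; here they are analytic Gaussian scores.
mu_real, sigma = 3.0, 1.0
mu_fake = 0.0

rng = np.random.default_rng(0)
samples = rng.normal(mu_fake, sigma, size=1000)  # one-step generator outputs

lr = 0.5
for _ in range(20):
    # DMD-style update: the approximate KL gradient w.r.t. each sample is
    # s_fake(x) - s_real(x), so gradient *descent* moves fake samples
    # toward the real distribution.
    grad = gaussian_score(samples, mu_fake, sigma) - gaussian_score(samples, mu_real, sigma)
    samples = samples - lr * grad
    mu_fake = samples.mean()  # "retrain" the fake score on current outputs

print(samples.mean())  # converges toward mu_real = 3.0
```

In the paper the fake score model is continually re-fit to the generator's outputs, which the `mu_fake = samples.mean()` line crudely mimics for Gaussians.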
Cite
Text
Yin et al. "One-Step Diffusion with Distribution Matching Distillation." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00632
Markdown
[Yin et al. "One-Step Diffusion with Distribution Matching Distillation." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/yin2024cvpr-onestep/) doi:10.1109/CVPR52733.2024.00632
BibTeX
@inproceedings{yin2024cvpr-onestep,
title = {{One-Step Diffusion with Distribution Matching Distillation}},
author = {Yin, Tianwei and Gharbi, Michaël and Zhang, Richard and Shechtman, Eli and Durand, Frédo and Freeman, William T. and Park, Taesung},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {6613--6623},
doi = {10.1109/CVPR52733.2024.00632},
url = {https://mlanthology.org/cvpr/2024/yin2024cvpr-onestep/}
}