Self-Guided Diffusion Models

Abstract

Diffusion models have demonstrated remarkable progress in image generation quality, especially when guidance is used to control the generative process. However, guidance requires a large number of image-annotation pairs for training and is thus dependent on their availability and correctness. In this paper, we eliminate the need for such annotation by instead exploiting the flexibility of self-supervision signals to design a framework for self-guided diffusion models. By leveraging a feature extraction function and a self-annotation function, our method provides guidance signals at various image granularities: from the level of holistic images to object boxes and even segmentation masks. Our experiments on single-label and multi-label image datasets demonstrate that self-labeled guidance always outperforms diffusion models without guidance and may even surpass guidance based on ground-truth labels. When equipped with self-supervised box or mask proposals, our method further generates visually diverse yet semantically consistent images, without the need for any class, box, or segment label annotation. Self-guided diffusion is simple, flexible and expected to profit from deployment at scale.
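As a rough illustration of the two components named in the abstract, the sketch below pairs a feature extraction function with a self-annotation function at the image level. Everything in it is an assumption made for illustration: `extract_features` is a toy stand-in for a self-supervised encoder, `self_annotate` uses plain k-means clustering as one possible way to derive pseudo-labels, and the four-pixel "images" are made up. It is not the authors' implementation; it only shows how cluster indices could take the place of human class labels as a guidance signal.

```python
import random


def extract_features(image):
    """Hypothetical feature extractor: mean intensity and intensity range.

    A real system would use a self-supervised network here (assumption).
    """
    return [sum(image) / len(image), max(image) - min(image)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_annotate(features, k, iterations=20, seed=0):
    """Hypothetical self-annotation function: k-means over the features.

    Returns one cluster index per feature vector; the index acts as a
    pseudo-label that needs no human annotation.
    """
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each feature vector.
        labels = [
            min(range(k), key=lambda j: squared_distance(f, centroids[j]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (made up): two dark and two bright flattened "images".
images = [
    [0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.1],
    [0.9, 1.0, 0.9, 1.0],
    [1.0, 0.9, 1.0, 1.0],
]
features = [extract_features(img) for img in images]
labels = self_annotate(features, k=2)
print(labels)
```

In a setup like this, each pseudo-label would be passed to the diffusion model as its conditioning input in place of a ground-truth class; box-level or mask-level guidance would need a different self-annotation function that operates on spatial features.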

Cite

Text

Hu et al. "Self-Guided Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01766

Markdown

[Hu et al. "Self-Guided Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/hu2023cvpr-selfguided/) doi:10.1109/CVPR52729.2023.01766

BibTeX

@inproceedings{hu2023cvpr-selfguided,
  title     = {{Self-Guided Diffusion Models}},
  author    = {Hu, Vincent Tao and Zhang, David W. and Asano, Yuki M. and Burghouts, Gertjan J. and Snoek, Cees G. M.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {18413-18422},
  doi       = {10.1109/CVPR52729.2023.01766},
  url       = {https://mlanthology.org/cvpr/2023/hu2023cvpr-selfguided/}
}