Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Abstract
In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoencoder (DAE). This deconstructive process allows us to explore how various components of modern DDMs influence self-supervised representation learning. We observe that only a very few modern components are critical for learning good representations, while many others are nonessential. Our study ultimately arrives at an approach that is highly simplified and to a large extent resembles a classical DAE. We hope our study will rekindle interest in a family of classical methods within the realm of modern self-supervised learning.
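To make the end point of the deconstruction concrete, below is a minimal sketch of a classical Denoising Autoencoder training step: corrupt the input with noise, then reconstruct the clean input. The tiny architecture, noise scale, and data shapes are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of a classical Denoising Autoencoder (DAE) training step.
# Architecture, noise scale (sigma), and input shape are assumptions for
# illustration only, not the configuration studied in the paper.
import torch
import torch.nn as nn

class TinyDAE(nn.Module):
    def __init__(self, dim=784, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyDAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(32, 784)            # clean inputs (e.g., flattened images)
noise = 0.3 * torch.randn_like(x)  # additive Gaussian corruption (sigma assumed)
x_noisy = x + noise

recon = model(x_noisy)
loss = nn.functional.mse_loss(recon, x)  # reconstruct the *clean* input
opt.zero_grad()
loss.backward()
opt.step()
```

The representation used for downstream evaluation would be the encoder's output; the denoising objective itself is what the deconstructed DDM ultimately reduces to.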
Cite
Text
Chen et al. "Deconstructing Denoising Diffusion Models for Self-Supervised Learning." International Conference on Learning Representations, 2025.

Markdown

[Chen et al. "Deconstructing Denoising Diffusion Models for Self-Supervised Learning." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/chen2025iclr-deconstructing/)

BibTeX
@inproceedings{chen2025iclr-deconstructing,
title = {{Deconstructing Denoising Diffusion Models for Self-Supervised Learning}},
author = {Chen, Xinlei and Liu, Zhuang and Xie, Saining and He, Kaiming},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/chen2025iclr-deconstructing/}
}