Synthetic Data from Diffusion Models Improves ImageNet Classification
Abstract
Deep generative models are becoming increasingly powerful, now generating diverse, high fidelity, photo-realistic samples given text prompts. Nevertheless, samples from such models have not been shown to significantly improve model training for challenging and well-studied discriminative tasks like ImageNet classification. In this paper we show that augmenting the ImageNet training set with samples from a generative diffusion model can yield substantial improvements in ImageNet classification accuracy over strong ResNet and Vision Transformer baselines. To this end we explore the fine-tuning of large-scale text-to-image diffusion models, yielding class-conditional ImageNet models with state-of-the-art FID score (1.76 at 256×256 resolution) and Inception Score (239 at 256×256). The model also yields a new state-of-the-art in Classification Accuracy Scores, i.e., ImageNet test accuracy for a ResNet-50 architecture trained solely on synthetic data (64.96% top-1 accuracy for 256×256 samples, improving to 69.24% for 1024×1024 samples). Adding up to three times as many synthetic samples as real training samples consistently improves ImageNet classification accuracy across multiple architectures.
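As a rough illustration of the augmentation recipe the abstract describes, the PyTorch sketch below mixes real ImageNet training images with class-conditional synthetic samples before standard classifier training. This is a minimal sketch, not the authors' code: the directory paths, the ImageFolder layout for the synthetic set, the batch size, and the optimizer settings are all assumptions.

```python
# Sketch: train a classifier on real ImageNet plus diffusion-generated samples.
# Paths and hyperparameters below are illustrative assumptions.
import torch
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms, models

transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Real ImageNet training split (standard class-subdirectory layout assumed).
real = datasets.ImageFolder("/data/imagenet/train", transform=transform)

# Synthetic samples from the fine-tuned class-conditional diffusion model,
# saved into the same 1000-class subdirectory layout so labels align.
# The abstract reports gains when adding up to ~3x as many synthetic
# samples as real ones.
synthetic = datasets.ImageFolder("/data/imagenet_synthetic/train", transform=transform)

# Simple concatenation of the two datasets; a weighted sampler could
# instead control the per-batch real/synthetic ratio.
train_loader = DataLoader(
    ConcatDataset([real, synthetic]),
    batch_size=256, shuffle=True, num_workers=8,
)

model = models.resnet50(num_classes=1000)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
criterion = torch.nn.CrossEntropyLoss()

for images, labels in train_loader:  # one epoch shown
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

The same pipeline with the real dataset removed corresponds to the Classification Accuracy Score setup: train the ResNet-50 solely on synthetic samples, then evaluate top-1 accuracy on the real ImageNet test set.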
Cite
Text
Azizi et al. "Synthetic Data from Diffusion Models Improves ImageNet Classification." Transactions on Machine Learning Research, 2023.
Markdown
[Azizi et al. "Synthetic Data from Diffusion Models Improves ImageNet Classification." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/azizi2023tmlr-synthetic/)
BibTeX
@article{azizi2023tmlr-synthetic,
title = {{Synthetic Data from Diffusion Models Improves ImageNet Classification}},
author = {Azizi, Shekoofeh and Kornblith, Simon and Saharia, Chitwan and Norouzi, Mohammad and Fleet, David J.},
journal = {Transactions on Machine Learning Research},
year = {2023},
url = {https://mlanthology.org/tmlr/2023/azizi2023tmlr-synthetic/}
}