Generalizing Dataset Distillation via Deep Generative Prior

Abstract

Dataset Distillation aims to distill an entire dataset's knowledge into a few synthetic images. The idea is to synthesize a small number of data points that, when given to a learning algorithm as training data, yield a model approximating one trained on the original data. Despite a recent upsurge of progress in the field, existing dataset distillation methods fail to generalize to new architectures and scale to high-resolution datasets. To overcome these issues, we propose to use the learned prior from pre-trained deep generative models to synthesize the distilled data. To achieve this, we present a new optimization algorithm that distills a large number of images into a few intermediate feature vectors in the generative model's latent space. Our method augments existing techniques, significantly improving cross-architecture generalization in all settings.
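
The core mechanism described in the abstract, optimizing the distilled data in a generator's latent space rather than in pixel space, can be sketched as below. This is a minimal illustration, not the authors' released code: it assumes PyTorch and placeholder objects (`generator`, `distillation_loss`, `real_loader` are all hypothetical names), and for simplicity it optimizes one input latent per synthetic image, whereas the paper distills into intermediate feature vectors partway through the generator.

```python
# Minimal sketch of distilling a dataset into generator latents.
# Assumptions: `generator` is a frozen pre-trained generative model mapping
# latents to images; `distillation_loss` stands in for any existing
# distillation objective (e.g., gradient matching) against real batches.
import torch

def distill_into_latents(generator, distillation_loss, real_loader,
                         num_synthetic=10, latent_dim=512,
                         steps=1000, lr=0.01, device="cuda"):
    generator.eval().requires_grad_(False)  # the generative prior stays fixed

    # The distilled "images" live in the generator's latent space,
    # not pixel space; only these latents receive gradient updates.
    latents = torch.randn(num_synthetic, latent_dim, device=device,
                          requires_grad=True)
    optimizer = torch.optim.Adam([latents], lr=lr)

    for _ in range(steps):
        synthetic_images = generator(latents)   # decode latents to pixels
        real_images, real_labels = next(iter(real_loader))
        loss = distillation_loss(synthetic_images,
                                 real_images.to(device),
                                 real_labels.to(device))
        optimizer.zero_grad()
        loss.backward()   # gradients flow back through the frozen generator
        optimizer.step()

    # One final decode produces the synthetic training set.
    return generator(latents).detach()
```

The intuition behind this parameterization is that the frozen generator constrains the synthetic data to its learned image manifold, so the decoded images stay natural-looking rather than overfitting to one architecture's inductive biases.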

Cite

Text

Cazenavette et al. "Generalizing Dataset Distillation via Deep Generative Prior." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00364

Markdown

[Cazenavette et al. "Generalizing Dataset Distillation via Deep Generative Prior." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/cazenavette2023cvpr-generalizing/) doi:10.1109/CVPR52729.2023.00364

BibTeX

@inproceedings{cazenavette2023cvpr-generalizing,
  title     = {{Generalizing Dataset Distillation via Deep Generative Prior}},
  author    = {Cazenavette, George and Wang, Tongzhou and Torralba, Antonio and Efros, Alexei A. and Zhu, Jun-Yan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {3739--3748},
  doi       = {10.1109/CVPR52729.2023.00364},
  url       = {https://mlanthology.org/cvpr/2023/cazenavette2023cvpr-generalizing/}
}