Enhancing Dataset Distillation via Non-Critical Region Refinement

Abstract

Dataset distillation has gained popularity as a technique for compressing large datasets into smaller, more efficient representations while retaining essential information for model training. Data features can be broadly divided into two types: instance-specific features, which capture unique, fine-grained details of individual examples, and class-general features, which represent shared, broad patterns across a class. However, previous approaches often struggle to balance these; some focus solely on class-general features, missing finer instance details, while others concentrate on instance-specific features, overlooking the shared characteristics essential for class-level understanding. In this paper, we propose the Non-Critical Region Refinement Dataset Distillation (NRR-DD) method, which preserves the instance-specific and fine-grained regions in synthetic data while enriching non-critical regions with more class-general information. This approach enables our models to leverage all pixel information to capture both types of features, thereby improving overall performance. Furthermore, we introduce Distance-Based Representative (DBR) knowledge transfer, which eliminates the need for soft labels in training by relying solely on the distance between synthetic data predictions and one-hot encoded labels. Experimental results demonstrate that our NRR-DD achieves state-of-the-art performance on both small-scale and large-scale datasets. Additionally, by storing only two distances per instance, our method achieves comparable results across various settings. Code will be available at https://github.com/tmtuan1307/NRR-DD.
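
The storage claim in the abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: it only compares the cost of keeping one full soft-label vector per synthetic image against keeping two scalar distances per image, as the abstract describes. The helper name `label_storage_bytes`, the 1000-class setting, the 10 images per class, and the 4-byte float width are all assumptions chosen for illustration, and the abstract does not specify what the two distances are.

```python
def label_storage_bytes(num_instances, values_per_instance, bytes_per_value=4):
    """Bytes needed to store a fixed number of float values per instance.

    bytes_per_value=4 assumes float32 storage (an assumption, not from the paper).
    """
    return num_instances * values_per_instance * bytes_per_value


# Hypothetical setting: 1000 classes, 10 synthetic images per class.
num_classes = 1000
images_per_class = 10
num_instances = num_classes * images_per_class

# Soft labels: one probability per class for every synthetic image.
soft_label_bytes = label_storage_bytes(num_instances, num_classes)

# Distance-based supervision: two scalars for every synthetic image.
distance_bytes = label_storage_bytes(num_instances, 2)

print(soft_label_bytes)                    # 40000000
print(distance_bytes)                      # 80000
print(soft_label_bytes // distance_bytes)  # 500
```

Under these assumptions the per-instance cost no longer grows with the number of classes, so the saving is larger on datasets with more classes. Methods that store a separate soft label for each augmented view would widen the gap further, though that case is not modeled here.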

Cite

Text

Tran et al. "Enhancing Dataset Distillation via Non-Critical Region Refinement." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00936

Markdown

[Tran et al. "Enhancing Dataset Distillation via Non-Critical Region Refinement." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/tran2025cvpr-enhancing/) doi:10.1109/CVPR52734.2025.00936

BibTeX

@inproceedings{tran2025cvpr-enhancing,
  title     = {{Enhancing Dataset Distillation via Non-Critical Region Refinement}},
  author    = {Tran, Minh-Tuan and Le, Trung and Le, Xuan-May and Do, Thanh-Toan and Phung, Dinh},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {10015--10024},
  doi       = {10.1109/CVPR52734.2025.00936},
  url       = {https://mlanthology.org/cvpr/2025/tran2025cvpr-enhancing/}
}