Dataset Distillation by Automatic Training Trajectories
Abstract
Dataset distillation creates a concise yet informative synthetic dataset that can replace the original dataset for training purposes. Some leading methods in this domain prioritize long-range matching, unrolling training trajectories with a fixed number of steps (N_S) on the synthetic dataset to align with various expert training trajectories. However, traditional long-range matching methods suffer from an overfitting-like problem: the fixed step count N_S forces the synthetic dataset to distortedly conform to the seen expert training trajectories, resulting in a loss of generality, especially toward trajectories from unencountered architectures. We refer to this as the Accumulated Mismatching Problem (AMP) and propose a new approach, Automatic Training Trajectories (ATT), which dynamically and adaptively adjusts the trajectory length N_S to address the AMP. Our method outperforms existing methods, particularly in cross-architecture tests. Moreover, owing to its adaptive nature, it exhibits enhanced stability in the face of parameter variations. Our source code is publicly available at https://github.com/NiaLiu/ATT
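The core idea, unrolling student training on the synthetic data and picking the trajectory length whose endpoint best matches the expert's target parameters, can be sketched in a few lines. This is a minimal toy illustration only: the function names, the quadratic toy objective, and the "argmin over candidate step counts" selection rule are our assumptions for exposition, not the paper's exact ATT procedure.

```python
import numpy as np

def matching_loss(student_params, expert_target, expert_start):
    # Normalized parameter distance commonly used in trajectory matching:
    # squared distance to the expert target, scaled by the expert's own movement.
    return np.sum((student_params - expert_target) ** 2) / np.sum(
        (expert_start - expert_target) ** 2
    )

def unroll_adaptive(theta_start, expert_target, grad_fn, lr, max_steps):
    """Unroll student SGD for up to max_steps and return the step count N_S
    whose endpoint is closest to the expert target (adaptive, rather than a
    fixed N_S as in traditional long-range matching)."""
    theta = theta_start.copy()
    best_loss, best_steps = np.inf, 0
    for step in range(1, max_steps + 1):
        theta = theta - lr * grad_fn(theta)  # one student training step
        loss = matching_loss(theta, expert_target, theta_start)
        if loss < best_loss:
            best_loss, best_steps = loss, step
    return best_steps, best_loss

# Toy example: student minimizes ||theta||^2 / 2 (gradient is theta itself),
# while the expert checkpoint sits partway toward the optimum at the origin.
steps, loss = unroll_adaptive(
    theta_start=np.array([2.0, 2.0]),
    expert_target=np.array([0.5, 0.5]),
    grad_fn=lambda th: th,
    lr=0.1,
    max_steps=30,
)
```

With a fixed N_S, too few steps undershoot the expert target and too many overshoot it; the adaptive selection above instead stops wherever the mismatch is smallest for this particular expert segment.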
Cite
Text
Liu et al. "Dataset Distillation by Automatic Training Trajectories." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73021-4_20

Markdown
[Liu et al. "Dataset Distillation by Automatic Training Trajectories." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/liu2024eccv-dataset/) doi:10.1007/978-3-031-73021-4_20

BibTeX
@inproceedings{liu2024eccv-dataset,
title = {{Dataset Distillation by Automatic Training Trajectories}},
author = {Liu, Dai and Gu, Jindong and Cao, Hu and Trinitis, Carsten and Schulz, Martin},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-73021-4_20},
url = {https://mlanthology.org/eccv/2024/liu2024eccv-dataset/}
}