How Much Pre-Training Is Enough to Discover a Good Subnetwork?
Abstract
Neural network pruning discovers efficient, high-performing subnetworks within pre-trained, dense network architectures. It typically involves a three-step process of pre-training, pruning, and re-training that is computationally expensive, as the dense model must be fully pre-trained. While prior work has empirically examined the relationship between the amount of pre-training and the performance of the pruned network, a theoretical characterization of this dependence is still missing. Aiming to mathematically analyze the amount of dense network pre-training needed for a pruned network to perform well, we derive a simple theoretical bound on the number of gradient descent pre-training iterations of a two-layer, fully-connected network in the NTK regime, beyond which pruning via greedy forward selection \citep{provable_subnetworks} yields a subnetwork that achieves good training error. Interestingly, this threshold depends logarithmically on the size of the dataset, meaning that larger datasets require more pre-training before subnetworks obtained via pruning perform well. Finally, we empirically validate our theoretical results on multi-layer perceptrons and residual convolutional networks trained on the MNIST, CIFAR, and ImageNet datasets.
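To make the pruning step concrete, below is a minimal sketch of greedy forward selection \citep{provable_subnetworks} applied to a pre-trained two-layer ReLU network: starting from an empty subnetwork, each iteration adds the hidden neuron whose inclusion most reduces the training error. All names here (forward, greedy_forward_selection, the random stand-in weights) are illustrative assumptions for this sketch, not the paper's released code.

import numpy as np

def forward(X, W, v, mask):
    # Subnetwork output: only hidden neurons with mask == 1 contribute,
    # and the output averages over the selected neurons.
    H = np.maximum(X @ W, 0.0)          # hidden activations, shape (n, m)
    k = max(mask.sum(), 1.0)            # avoid division by zero for an empty mask
    return (H * mask) @ v / k

def greedy_forward_selection(X, y, W, v, size):
    # Grow a subnetwork of `size` neurons; each step adds the single
    # remaining neuron that most reduces squared training error.
    mask = np.zeros(W.shape[1])
    for _ in range(size):
        best_j, best_loss = None, np.inf
        for j in np.flatnonzero(mask == 0):
            trial = mask.copy()
            trial[j] = 1.0
            loss = np.mean((forward(X, W, v, trial) - y) ** 2)
            if loss < best_loss:
                best_j, best_loss = j, loss
        mask[best_j] = 1.0
    return mask

# Toy usage with random stand-ins for pre-trained weights:
# prune a width-64 hidden layer down to 8 neurons.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 10)), rng.normal(size=100)
W, v = rng.normal(size=(10, 64)), rng.normal(size=64)
print("selected neurons:", np.flatnonzero(greedy_forward_selection(X, y, W, v, size=8)))

In this setting, the paper's question becomes: how many gradient descent iterations must produce W and v before the subnetwork selected above achieves low training error?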
Cite
Wolfe et al. "How Much Pre-Training Is Enough to Discover a Good Subnetwork?" Transactions on Machine Learning Research, 2024. https://mlanthology.org/tmlr/2024/wolfe2024tmlr-much/

BibTeX
@article{wolfe2024tmlr-much,
title = {{How Much Pre-Training Is Enough to Discover a Good Subnetwork?}},
author = {Wolfe, Cameron R. and Liao, Fangshuo and Wang, Qihan and Kim, Junhyung Lyle and Kyrillidis, Anastasios},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/wolfe2024tmlr-much/}
}