Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Abstract
Obtaining versions of deep neural networks that are both highly accurate and highly sparse is one of the main challenges in the area of model compression, and several high-performance pruning techniques have been investigated by the community. Yet, much less is known about the interaction between sparsity and the standard stochastic optimization techniques used for training sparse networks, and most existing work uses standard dense schedules and hyperparameters for training sparse networks. In this work, we examine the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks. We begin by showing that using standard dense training recipes for sparse training is suboptimal, and provide evidence that this results in *under-training*, loosely defined as using a suboptimal number of passes over the training data. We present training recipes that mitigate this issue for both sparse pre-training of vision models (e.g., ResNet50/ImageNet) and sparse fine-tuning of language models (e.g., BERT/GLUE), achieving state-of-the-art results in both settings in the high-sparsity regime, and provide detailed analyses of the difficulty of sparse training in both scenarios. Our work sets a new benchmark in terms of the accuracies that can be achieved under high sparsity, and should inspire further research into improving sparse model training, not only to reach higher accuracies under high sparsity but also to do so efficiently.
Cite
Text
Kuznedelev et al. "Accurate Neural Network Pruning Requires Rethinking Sparse Optimization." Transactions on Machine Learning Research, 2024.
Markdown
[Kuznedelev et al. "Accurate Neural Network Pruning Requires Rethinking Sparse Optimization." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/kuznedelev2024tmlr-accurate/)
BibTeX
@article{kuznedelev2024tmlr-accurate,
title = {{Accurate Neural Network Pruning Requires Rethinking Sparse Optimization}},
author = {Kuznedelev, Denis and Kurtic, Eldar and Iofinova, Eugenia and Frantar, Elias and Peste, Alexandra and Alistarh, Dan},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/kuznedelev2024tmlr-accurate/}
}