Data-Free Model Pruning at Initialization via Expanders

Abstract

In light of the enormous computational resources required to store and train modern deep learning models, significant research has focused on model compression. When compressed networks are deployed on remote devices before being trained, the compression scheme cannot use any training data or information derived from it (e.g., gradients). This leaves only the structure of the network to work with, and the existing literature on how graph structure affects network performance is scarce. Recently, expander graphs have been put forward as a tool for sparsifying neural architectures; unfortunately, existing models rarely outperform a naïve random baseline. In this work, we propose a stronger model for generating expanders, which we then use to sparsify a variety of mainstream CNN architectures. We demonstrate that accuracy is an increasing function of expansion in a sparse model, and we analyse and explain its superior performance over alternative models.
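The abstract does not spell out the expander construction used in the paper, but a common data-free recipe is to wire each output unit of a layer to a fixed number of uniformly chosen inputs, since sparse random regular bipartite graphs are expanders with high probability. The sketch below illustrates this idea for a fully connected layer in PyTorch; the names `random_regular_mask` and `ExpanderLinear`, the `degree` parameter, and the random-regular construction itself are illustrative assumptions, not the authors' exact method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_regular_mask(out_features: int, in_features: int, degree: int) -> torch.Tensor:
    """Binary mask whose connectivity graph is a random d-left-regular bipartite graph.

    Each output unit is connected to `degree` distinct, uniformly chosen inputs.
    Such graphs are expanders with high probability, which is the structural
    property this sketch relies on (illustrative assumption, not the paper's model).
    """
    mask = torch.zeros(out_features, in_features)
    for row in range(out_features):
        cols = torch.randperm(in_features)[:degree]
        mask[row, cols] = 1.0
    return mask

class ExpanderLinear(nn.Module):
    """Linear layer pruned at initialization with a fixed expander-style mask."""

    def __init__(self, in_features: int, out_features: int, degree: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # The mask is data-free: it depends only on the layer shape and `degree`,
        # so it can be fixed before the network ever sees training data.
        self.register_buffer("mask", random_regular_mask(out_features, in_features, degree))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Masked weights: only the expander edges carry signal; the rest stay zero.
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

# Usage: keep only 8 of the 512 possible input connections per output unit.
layer = ExpanderLinear(in_features=512, out_features=256, degree=8)
out = layer(torch.randn(4, 512))
```

The same masking idea extends to convolutional layers by treating input and output channels as the two sides of the bipartite graph, which is how expander-based sparsification is typically applied to CNN architectures.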

Cite

Text

Stewart et al. "Data-Free Model Pruning at Initialization via Expanders." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023. doi:10.1109/CVPRW59228.2023.00475

Markdown

[Stewart et al. "Data-Free Model Pruning at Initialization via Expanders." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023.](https://mlanthology.org/cvprw/2023/stewart2023cvprw-datafree/) doi:10.1109/CVPRW59228.2023.00475

BibTeX

@inproceedings{stewart2023cvprw-datafree,
  title     = {{Data-Free Model Pruning at Initialization via Expanders}},
  author    = {Stewart, James and Michieli, Umberto and Ozay, Mete},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2023},
  pages     = {4519--4524},
  doi       = {10.1109/CVPRW59228.2023.00475},
  url       = {https://mlanthology.org/cvprw/2023/stewart2023cvprw-datafree/}
}