Optimization Based Layer-Wise Magnitude-Based Pruning for DNN Compression
Abstract
Layer-wise magnitude-based pruning (LMP) is a popular method for deep neural network (DNN) compression. However, tuning the layer-specific thresholds is difficult, since the space of threshold candidates is exponentially large and evaluation is very expensive. Previous methods tune the thresholds mainly by hand and require expertise. In this paper, we propose an automatic tuning approach based on optimization, named OLMP. The idea is to transform the threshold tuning problem into a constrained optimization problem (i.e., minimizing the size of the pruned model subject to a constraint on the accuracy loss), and then use powerful derivative-free optimization algorithms to solve it. To compress a trained DNN, OLMP is conducted within a new iterative pruning and adjusting pipeline. Empirical results show that OLMP achieves the best pruning ratio on LeNet-style models (i.e., 114 times for LeNet-300-100 and 298 times for LeNet-5) compared with some state-of-the-art DNN pruning methods, and can reduce the size of an AlexNet-style network up to 82 times without accuracy loss.
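The constrained formulation in the abstract admits a compact illustration. Below is a minimal, hypothetical Python/PyTorch sketch, not the authors' code: the layer thresholds are the decision variables, the objective is the pruned model's parameter count, and the accuracy-loss constraint is enforced with a simple penalty. Plain random search stands in for the paper's derivative-free optimizer, and the helper names (`prune_by_thresholds`, `eval_fn`) are illustrative assumptions.

```python
import copy
import random

import torch

def prune_by_thresholds(model, thresholds):
    """Zero out weights whose magnitude falls below the layer's threshold;
    return the number of surviving (non-zero) parameters."""
    remaining = 0
    with torch.no_grad():
        for (name, param), tau in zip(model.named_parameters(), thresholds):
            mask = param.abs() >= tau      # layer-wise magnitude criterion
            param.mul_(mask)               # apply the pruning mask in place
            remaining += int(mask.sum())
    return remaining

def objective(thresholds, model, eval_fn, base_acc, eps=0.01, penalty=1e9):
    """Constrained objective: minimize pruned-model size subject to an
    accuracy loss of at most eps, handled here via a penalty term."""
    pruned = copy.deepcopy(model)
    size = prune_by_thresholds(pruned, thresholds)
    acc = eval_fn(pruned)                  # user-supplied accuracy evaluator
    return size + (penalty if base_acc - acc > eps else 0.0)

def tune_thresholds(model, eval_fn, n_layers, budget=200, eps=0.01):
    """Derivative-free search over the threshold vector; random search
    is a stand-in for the optimizer actually used in the paper."""
    base_acc = eval_fn(model)
    best, best_val = None, float("inf")
    for _ in range(budget):
        cand = [random.uniform(0.0, 0.2) for _ in range(n_layers)]
        val = objective(cand, model, eval_fn, base_acc, eps)
        if val < best_val:
            best, best_val = cand, val
    return best
```

In the paper's pipeline this threshold search would alternate with an adjusting (fine-tuning) step on the pruned network; the sketch above only covers the tuning step.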
Cite
Text
Li et al. "Optimization Based Layer-Wise Magnitude-Based Pruning for DNN Compression." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/330
Markdown
[Li et al. "Optimization Based Layer-Wise Magnitude-Based Pruning for DNN Compression." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/li2018ijcai-optimization/) doi:10.24963/IJCAI.2018/330
BibTeX
@inproceedings{li2018ijcai-optimization,
title = {{Optimization Based Layer-Wise Magnitude-Based Pruning for DNN Compression}},
author = {Li, Guiying and Qian, Chao and Jiang, Chunhui and Lu, Xiaofen and Tang, Ke},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2018},
pages = {2383-2389},
doi = {10.24963/IJCAI.2018/330},
url = {https://mlanthology.org/ijcai/2018/li2018ijcai-optimization/}
}