Accelerating CNN Training by Pruning Activation Gradients
Abstract
Sparsification is an efficient approach to accelerate CNN inference, but it is challenging to exploit sparsity in the training procedure because the involved gradients change dynamically. In fact, an important observation shows that most of the activation gradients in back-propagation are very close to zero and have only a tiny impact on weight updates. Hence, we consider randomly pruning these very small gradients, according to the statistical distribution of activation gradients, to accelerate CNN training. Meanwhile, we theoretically analyze the impact of the pruning algorithm on convergence. The proposed approach is evaluated on AlexNet and ResNet-{18, 34, 50, 101, 152} with the CIFAR-{10, 100} and ImageNet datasets. Experimental results show that our training approach achieves up to $5.92\times$ speedup in the back-propagation stage with negligible accuracy loss.
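The abstract describes randomly pruning near-zero activation gradients during back-propagation while keeping the gradient estimate unbiased. Below is a minimal NumPy sketch of one common form of such stochastic threshold pruning; the function name `stochastic_prune` and the quantile-based choice of the threshold `tau` are illustrative assumptions, not the paper's exact distribution-based rule.

```python
import numpy as np

def stochastic_prune(grad, tau):
    """Stochastically prune activation-gradient entries with magnitude below tau.

    Entries with |g| >= tau are kept unchanged. Entries with |g| < tau are set to
    sign(g) * tau with probability |g| / tau and to zero otherwise, so the expected
    value of every entry is preserved (E[pruned g] = g).
    """
    out = grad.copy()
    small = np.abs(grad) < tau
    keep_prob = np.abs(grad[small]) / tau
    keep = np.random.random_sample(keep_prob.shape) < keep_prob
    out[small] = np.where(keep, np.sign(grad[small]) * tau, 0.0)
    return out

# Usage sketch: tau is taken here as a fixed quantile of the gradient magnitudes
# (a placeholder for the statistics-driven threshold the paper derives).
grad = (np.random.randn(4, 64, 32, 32) * 1e-3).astype(np.float32)
tau = np.quantile(np.abs(grad), 0.9)
sparse_grad = stochastic_prune(grad, tau)
print("sparsity:", float((sparse_grad == 0).mean()))
```

The resulting gradient tensor is mostly zeros, which is what enables sparse back-propagation kernels to skip work; the stochastic rounding to the threshold is what keeps weight updates unbiased on average.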
Cite
Text
Ye et al. "Accelerating CNN Training by Pruning Activation Gradients." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58595-2_20

Markdown

[Ye et al. "Accelerating CNN Training by Pruning Activation Gradients." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/ye2020eccv-accelerating/) doi:10.1007/978-3-030-58595-2_20

BibTeX
@inproceedings{ye2020eccv-accelerating,
title = {{Accelerating CNN Training by Pruning Activation Gradients}},
author = {Ye, Xucheng and Dai, Pengcheng and Luo, Junyu and Guo, Xin and Qi, Yingjie and Yang, Jianlei and Chen, Yiran},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2020},
doi = {10.1007/978-3-030-58595-2_20},
url = {https://mlanthology.org/eccv/2020/ye2020eccv-accelerating/}
}