A Novel Differentiable Mixed-Precision Quantization Search Framework for Alleviating the Matthew Effect and Improving Robustness

Abstract

Network quantization is an effective and widely used model compression technique. Recently, several works have applied differentiable neural architecture search (NAS) methods to mixed-precision quantization (MPQ) and achieved encouraging results. However, the nature of differentiable architecture search can lead to the Matthew effect in mixed-precision search: candidates with higher bit-widths are trained to maturity earlier, while candidates with lower bit-widths may never have the chance to express the desired function. To address this issue, we propose a novel mixed-precision quantization framework. The mixed-precision search is formulated as a distribution learning problem, which alleviates the Matthew effect and improves generalization ability. Meanwhile, unlike generic differentiable NAS, in mixed-precision quantization search the search space grows rapidly as the network depth increases, which makes the supernet harder to train and the search process unstable. To this end, we add a skip connection with a gradually decreasing architecture weight between convolutional layers in the supernet to improve robustness. The skip connection aids optimization during search and does not participate in the bit-width competition. Extensive experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of the proposed methods. For example, when quantizing ResNet-50 on ImageNet, we achieve a state-of-the-art 156.10x BitOps compression rate while maintaining 75.87% accuracy.
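To make the two ideas in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: it assumes the "distribution learning" is realized by sampling a bit-width from a learned categorical distribution (here via Gumbel-softmax), and models the auxiliary skip connection as a branch with a scheduled, non-learned weight that decays to zero. All names, the uniform quantizer, and the decay schedule are illustrative assumptions.

```python
# Hedged sketch of distribution-based bit-width search with a decaying skip branch.
# Not the paper's code; Gumbel-softmax sampling, the quantizer, and the schedule
# are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


def uniform_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric uniform weight quantization with a straight-through estimator."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.round(w / scale).clamp(-qmax, qmax) * scale
    return w + (w_q - w).detach()  # forward uses w_q, backward flows through w


class MixedPrecisionConv(nn.Module):
    """Conv layer whose weight bit-width is drawn from a learned distribution."""

    def __init__(self, in_ch, out_ch, bit_choices=(2, 4, 8)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.bit_choices = bit_choices
        # Logits of the categorical distribution over candidate bit-widths.
        self.bit_logits = nn.Parameter(torch.zeros(len(bit_choices)))

    def forward(self, x, skip_weight: float = 0.0, tau: float = 1.0):
        # Sampling a (hard) one-hot selection gives every candidate a chance to be
        # trained, instead of letting the highest bit-width dominate a
        # softmax-weighted mixture (the Matthew effect described in the abstract).
        sel = F.gumbel_softmax(self.bit_logits, tau=tau, hard=True)
        out = 0.0
        for prob, bits in zip(sel, self.bit_choices):
            w_q = uniform_quantize(self.conv.weight, bits)
            out = out + prob * F.conv2d(x, w_q, padding=1)
        # Skip branch with a scheduled weight: it stabilizes early search and,
        # because its weight is not learned, it never competes with bit-widths.
        if x.shape == out.shape:
            out = out + skip_weight * x
        return out


if __name__ == "__main__":
    layer = MixedPrecisionConv(16, 16)
    x = torch.randn(2, 16, 8, 8)
    for epoch in range(3):
        skip_w = max(0.0, 0.5 * (1 - epoch / 3))  # skip weight decays toward zero
        y = layer(x, skip_weight=skip_w)
        y.mean().backward()  # optimizer step omitted; this only shows the schedule
```

The sketch only illustrates the mechanism at the layer level; the paper's actual distribution learning objective, quantizer, and decay schedule may differ.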

Cite

Text

Zhou et al. "A Novel Differentiable Mixed-Precision Quantization Search Framework for Alleviating the Matthew Effect and Improving Robustness." Proceedings of The 14th Asian Conference on Machine Learning, 2022.

Markdown

[Zhou et al. "A Novel Differentiable Mixed-Precision Quantization Search Framework for Alleviating the Matthew Effect and Improving Robustness." Proceedings of The 14th Asian Conference on Machine Learning, 2022.](https://mlanthology.org/acml/2022/zhou2022acml-novel/)

BibTeX

@inproceedings{zhou2022acml-novel,
  title     = {{A Novel Differentiable Mixed-Precision Quantization Search Framework for Alleviating the Matthew Effect and Improving Robustness}},
  author    = {Zhou, Hengyi and He, Hongyi and Liu, Wanchen and Li, Yuhai and Zhang, Haonan and Liu, Longjun},
  booktitle = {Proceedings of The 14th Asian Conference on Machine Learning},
  year      = {2022},
  pages     = {1277--1292},
  volume    = {189},
  url       = {https://mlanthology.org/acml/2022/zhou2022acml-novel/}
}