SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks

Abstract

Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. At very low precisions, such as binary or ternary networks with 1- to 8-bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation has minimal or no hardware implications compared to more coarse-grained approaches. Source code is available at https://www.github.com/julianfaraone/SYQ.
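
To make the idea of a symmetric codebook per weight subgroup concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each row of the weight matrix is one subgroup, uses a fixed ternarization threshold, and sets each subgroup's scale to the mean absolute weight instead of learning it during training. The function names (`ternarize`, `row_scales`, `quantize_symmetric`) and the threshold value are hypothetical and chosen only for illustration.

```python
def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1 using a symmetric threshold (assumed fixed)."""
    return [
        [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in row]
        for row in weights
    ]


def row_scales(weights):
    """One positive scale per row (the assumed subgroup): mean absolute value.

    In a learned scheme these scales would be trained; here they are
    computed once as a stand-in.
    """
    return [sum(abs(w) for w in row) / len(row) for row in weights]


def quantize_symmetric(weights, threshold=0.05):
    """Quantize each row to the symmetric codebook {-a, 0, +a} with its own a."""
    codes = ternarize(weights, threshold)
    scales = row_scales(weights)
    quantized = [[s * c for c in row] for row, s in zip(codes, scales)]
    return codes, scales, quantized


# Toy 2x3 weight matrix (made-up values for illustration only).
W = [[0.4, -0.02, -0.6], [0.01, 0.1, -0.1]]
codes, scales, Wq = quantize_symmetric(W)
```

Because each row stores only ternary codes plus a single scale, the positive and negative codebook entries share one magnitude per subgroup. This is the sense in which the sketch is symmetric; a finer or coarser grouping than rows would only change how `row_scales` partitions the matrix.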

Cite

Text

Faraone et al. "SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00452

Markdown

[Faraone et al. "SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/faraone2018cvpr-syq/) doi:10.1109/CVPR.2018.00452

BibTeX

@inproceedings{faraone2018cvpr-syq,
  title     = {{SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks}},
  author    = {Faraone, Julian and Fraser, Nicholas and Blott, Michaela and Leong, Philip H.W.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00452},
  url       = {https://mlanthology.org/cvpr/2018/faraone2018cvpr-syq/}
}