Bi-Real Net: Enhancing the Performance of 1-Bit CNNs with Improved Representational Capability and Advanced Training Algorithm
Abstract
In this work, we study 1-bit convolutional neural networks (CNNs), in which both the weights and the activations are binary. Although efficient, current 1-bit CNNs have much lower classification accuracy than their real-valued counterparts on large-scale datasets such as ImageNet. To shrink the performance gap between 1-bit and real-valued CNN models, we propose a novel model, dubbed Bi-Real net, which connects the real activations (after the 1-bit convolution and/or BatchNorm layer, before the sign function) to the activations of the consecutive block through an identity shortcut. Consequently, compared to the standard 1-bit CNN, the representational capability of the Bi-Real net is significantly enhanced, with only a negligible additional computational cost. Moreover, we develop a specific training algorithm for 1-bit CNNs that includes three technical novelties. First, we derive a tight approximation to the derivative of the non-differentiable sign function with respect to the activation. Second, we propose a magnitude-aware gradient with respect to the weight for updating the weight parameter. Last, we pre-train the real-valued CNN model with a clip function, rather than the ReLU function, to provide a better initialization for Bi-Real net. Experiments on ImageNet show that the Bi-Real net with the proposed training algorithm achieves 56.4% and 62.2% top-1 accuracy with 18 layers and 34 layers, respectively, and achieves up to 23.9× memory saving and 17.0× computational reduction.
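The shortcut and the approximate sign derivative described in the abstract can be illustrated with a minimal pure-Python sketch. Several things here are assumptions that the abstract does not state: the piecewise-quadratic form of the surrogate for sign, the choice to map zero to +1, and the use of a plain dot product as a stand-in for the 1-bit convolution. BatchNorm, the magnitude-aware weight gradient, and clip-based pre-training are left out. The function names are hypothetical and are not taken from the authors' code.

```python
def sign(x):
    """Binarize a real value to {-1.0, +1.0}; zero maps to +1.0 (assumption)."""
    return 1.0 if x >= 0 else -1.0


def approx_sign_grad(x):
    """Derivative of an assumed piecewise-quadratic surrogate for sign.

    Used in the backward pass only, in place of the true derivative of
    sign, which is zero almost everywhere.
    """
    if -1.0 <= x < 0.0:
        return 2.0 + 2.0 * x
    if 0.0 <= x < 1.0:
        return 2.0 - 2.0 * x
    return 0.0


def binary_linear(x_bin, w_bin):
    """Stand-in for a 1-bit convolution: dot products of {-1, +1} vectors."""
    return [sum(w * x for w, x in zip(row, x_bin)) for row in w_bin]


def bi_real_block(x_real, w_real):
    """One simplified block: binarize, apply the binary layer, add the shortcut."""
    x_bin = [sign(v) for v in x_real]
    w_bin = [[sign(w) for w in row] for row in w_real]
    y = binary_linear(x_bin, w_bin)
    # Identity shortcut: the real-valued input is added to the block output,
    # so the next block's sign function sees real-valued information.
    return [xi + yi for xi, yi in zip(x_real, y)]


x_real = [0.3, -1.2]
w_real = [[0.5, -0.1], [-0.7, 0.2]]
out = bi_real_block(x_real, w_real)
print(out)
print([approx_sign_grad(v) for v in (-0.5, 0.0, 0.5, 1.5)])
```

Without the shortcut, the block output would take only a small set of integer values; adding the real-valued input is what keeps the extra information flowing at the cost of one element-wise addition per block.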
Cite
Text
Liu et al. "Bi-Real Net: Enhancing the Performance of 1-Bit CNNs with Improved Representational Capability and Advanced Training Algorithm." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01267-0_44
Markdown
[Liu et al. "Bi-Real Net: Enhancing the Performance of 1-Bit CNNs with Improved Representational Capability and Advanced Training Algorithm." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/liu2018eccv-bireal/) doi:10.1007/978-3-030-01267-0_44
BibTeX
@inproceedings{liu2018eccv-bireal,
title = {{Bi-Real Net: Enhancing the Performance of 1-Bit CNNs with Improved Representational Capability and Advanced Training Algorithm}},
author = {Liu, Zechun and Wu, Baoyuan and Luo, Wenhan and Yang, Xin and Liu, Wei and Cheng, Kwang-Ting},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2018},
doi = {10.1007/978-3-030-01267-0_44},
url = {https://mlanthology.org/eccv/2018/liu2018eccv-bireal/}
}