TRQ: Ternary Neural Networks with Residual Quantization
Abstract
Ternary neural networks (TNNs) show great potential for network acceleration by reducing full-precision weights to ternary values, e.g., {-1, 0, 1}. However, existing TNNs are mostly built on rule-of-thumb quantization methods that rely on simple thresholding operations, which causes significant accuracy loss. In this paper, we introduce a stem-residual framework that provides new insight into ternary quantization, termed Residual Quantization (TRQ), to achieve more powerful TNNs. Rather than thresholding weights directly, TRQ recursively quantizes the full-precision weights and refines the reconstruction by combining a binarized stem part with a binarized residual part. This unique quantization process endows the quantizer with high flexibility and precision. TRQ is also generic: it can easily be extended to multiple bits by recursively encoding the residual, yielding better recognition accuracy. Extensive experimental results demonstrate that the proposed method achieves high recognition accuracy while delivering acceleration.
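To make the stem-residual idea concrete, below is a minimal sketch of how such a quantizer could look. It is not the paper's exact formulation: the function name `trq_ternarize`, the single per-tensor scale `alpha` estimated as the mean absolute weight, and the shared scale for the stem and residual binarizations are assumptions made for illustration.

```python
import torch

def trq_ternarize(w: torch.Tensor) -> torch.Tensor:
    """Sketch of stem-residual ternary quantization (assumed formulation).

    The full-precision weights are first binarized into a "stem" part;
    the remaining residual is binarized again, and the two parts are
    summed, giving a ternary reconstruction in {-2*alpha, 0, +2*alpha}.
    """
    alpha = w.abs().mean()                   # assumed per-tensor scale
    stem = alpha * torch.sign(w)             # binarized stem part
    residual = w - stem                      # full-precision residual
    res_bin = alpha * torch.sign(residual)   # binarized residual part
    return stem + res_bin                    # ternary reconstruction

# Usage: the reconstruction of a random weight tensor has three levels.
w = torch.randn(64, 64)
print(torch.unique(trq_ternarize(w)))        # roughly {-2*alpha, 0, +2*alpha}
```

Extending to more bits would follow the same pattern: recursively binarize the remaining residual, adding one binary term per extra bit.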
Cite
Text
Li et al. "TRQ: Ternary Neural Networks with Residual Quantization." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I10.17036
Markdown
[Li et al. "TRQ: Ternary Neural Networks with Residual Quantization." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/li2021aaai-trq/) doi:10.1609/AAAI.V35I10.17036
BibTeX
@inproceedings{li2021aaai-trq,
title = {{TRQ: Ternary Neural Networks with Residual Quantization}},
author = {Li, Yue and Ding, Wenrui and Liu, Chunlei and Zhang, Baochang and Guo, Guodong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {8538-8546},
doi = {10.1609/AAAI.V35I10.17036},
url = {https://mlanthology.org/aaai/2021/li2021aaai-trq/}
}