TRQ: Ternary Neural Networks with Residual Quantization

Abstract

Ternary neural networks (TNNs) hold great potential for network acceleration by reducing a network's full-precision weights to ternary values, e.g., {-1, 0, 1}. However, existing TNNs are mostly built on rule-of-thumb quantization methods that rely on simple thresholding operations, which causes significant accuracy loss. In this paper, we introduce a stem-residual framework that provides new insight into ternary quantization, termed Residual Quantization (TRQ), to achieve more powerful TNNs. Rather than directly applying thresholding operations, TRQ recursively performs quantization on the full-precision weights, obtaining a refined reconstruction by combining a binarized stem part with a binarized residual part. This unique quantization process endows the quantizer with high flexibility and precision. TRQ is also generic: it can easily be extended to multiple bits by recursively encoding the residual, yielding better recognition accuracy. Extensive experimental results demonstrate that the proposed method achieves high recognition accuracy while delivering significant acceleration.
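
To make the stem-residual idea concrete, below is a minimal sketch (not the authors' released implementation) of how a binarized stem and a binarized residual can be recombined into a ternary weight. The function name trq_ternarize and the use of a single shared scaling factor alpha (the mean absolute weight) are illustrative assumptions; the abstract does not specify the scaling scheme.

import torch

def trq_ternarize(w: torch.Tensor) -> torch.Tensor:
    """Illustrative stem + residual binarization yielding a ternary weight.

    Assumption: one scaling factor `alpha` is shared by the stem and the
    residual, which is what makes the recombined weight take (almost
    everywhere) only three values.
    """
    alpha = w.abs().mean()

    # Stem: binarize the full-precision weights.
    stem = alpha * torch.sign(w)

    # Residual: binarize what the stem failed to capture.
    residual = w - stem
    res_bin = alpha * torch.sign(residual)

    # Recombination: stem + binarized residual lies in {-2*alpha, 0, +2*alpha}
    # for almost all inputs, i.e., a ternary weight.
    return stem + res_bin

# Example: quantize a random conv kernel and inspect the distinct levels.
w = torch.randn(64, 3, 3, 3)
print(torch.unique(trq_ternarize(w)))  # at most three values (up to ties)

Under the shared-alpha assumption, two binarizations collapse to a three-level quantizer; repeating the residual step on what remains is one way to read the abstract's multi-bit extension.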

Cite

Text

Li et al. "TRQ: Ternary Neural Networks with Residual Quantization." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I10.17036

Markdown

[Li et al. "TRQ: Ternary Neural Networks with Residual Quantization." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/li2021aaai-trq/) doi:10.1609/AAAI.V35I10.17036

BibTeX

@inproceedings{li2021aaai-trq,
  title     = {{TRQ: Ternary Neural Networks with Residual Quantization}},
  author    = {Li, Yue and Ding, Wenrui and Liu, Chunlei and Zhang, Baochang and Guo, Guodong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {8538--8546},
  doi       = {10.1609/AAAI.V35I10.17036},
  url       = {https://mlanthology.org/aaai/2021/li2021aaai-trq/}
}