ProxQuant: Quantized Neural Networks via Proximal Operators
Abstract
To make deep neural networks feasible in resource-constrained environments (such as mobile devices), it is beneficial to quantize models by using low-precision weights. One common technique for quantizing neural networks is the straight-through gradient method, which enables back-propagation through the quantization mapping. Despite its empirical success, little is understood about why the straight-through gradient method works. Building upon a novel observation that the straight-through gradient method is in fact identical to the well-known Nesterov’s dual-averaging algorithm on a quantization constrained optimization problem, we propose a more principled alternative approach, called ProxQuant, which instead formulates quantized network training as a regularized learning problem and optimizes it via the prox-gradient method. ProxQuant performs back-propagation on the underlying full-precision vector and applies an efficient prox-operator between stochastic gradient steps to encourage quantizedness. For quantizing ResNets and LSTMs, ProxQuant outperforms state-of-the-art results on binary quantization and is on par with state-of-the-art on multi-bit quantization. We further perform theoretical analyses showing that ProxQuant converges to stationary points under mild smoothness assumptions, whereas variants such as the lazy prox-gradient method can fail to converge in the same setting.
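As a rough illustration of the training loop described above, the sketch below alternates a stochastic gradient step on the full-precision weights with a prox step that soft-thresholds each weight toward its nearest binary value, corresponding to a W-shaped binary regularizer of the form λ · min(|θ − 1|, |θ + 1|). The function names, the fixed regularization strength, and the toy data are illustrative assumptions rather than the paper's exact recipe (which, among other things, covers multi-bit quantization and schedules the regularization strength during training).

import numpy as np

def prox_binary(theta, lam):
    # Prox operator of lam * min(|theta - 1|, |theta + 1|):
    # soft-threshold each weight toward its nearest binary value (+1 or -1).
    target = np.sign(theta)            # nearest quantization point
    target[target == 0] = 1.0          # break ties at 0 toward +1
    residual = theta - target
    shrunk = np.sign(residual) * np.maximum(np.abs(residual) - lam, 0.0)
    return target + shrunk

def proxquant_step(theta, grad, lr, lam):
    # One ProxQuant-style update (sketch): an SGD step on the
    # full-precision weights, followed by the prox step above.
    # (A straight-through-style method would instead compute grad at
    # the hard-quantized weights and skip the prox step.)
    theta = theta - lr * grad          # stochastic gradient step
    return prox_binary(theta, lam)     # prox step between gradient updates

# Tiny usage example with random toy data (hypothetical values).
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
grad = rng.normal(size=5)
print(proxquant_step(theta, grad, lr=0.1, lam=0.05))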
Cite
Text
Bai et al. "ProxQuant: Quantized Neural Networks via Proximal Operators." International Conference on Learning Representations, 2019.

Markdown
[Bai et al. "ProxQuant: Quantized Neural Networks via Proximal Operators." International Conference on Learning Representations, 2019.](https://mlanthology.org/iclr/2019/bai2019iclr-proxquant/)

BibTeX
@inproceedings{bai2019iclr-proxquant,
title = {{ProxQuant: Quantized Neural Networks via Proximal Operators}},
author = {Bai, Yu and Wang, Yu-Xiang and Liberty, Edo},
booktitle = {International Conference on Learning Representations},
year = {2019},
url = {https://mlanthology.org/iclr/2019/bai2019iclr-proxquant/}
}