Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise

Abstract

We study the problem of training deep networks while quantizing parameters and activations into low-precision numeric representations, a setting central to reducing energy consumption and inference time of deployed models. We propose a method that learns different precisions, as measured by bits in numeric representations, for different weights in a neural network, yielding a heterogeneous allocation of bits across parameters. Learning precisions occurs alongside learning weight values, using a strategy derived from a novel framework wherein the intractability of optimizing discrete precisions is approximated by training per-parameter noise magnitudes. We broaden this framework to also encompass learning precisions for hidden state activations, simultaneously with weight precisions and values. Our approach exposes the objective of constructing a low-precision inference-efficient model to the entirety of the training process. Experiments show that it finds highly heterogeneous precision assignments for CNNs trained on CIFAR and ImageNet, improving upon previous state-of-the-art quantization methods. Our improvements extend to the challenging scenario of learning reduced-precision GANs.
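To make the core idea concrete, below is a minimal sketch of quantization-as-trainable-noise in a PyTorch setting. It is an illustration under stated assumptions, not the authors' released implementation: the names `NoisyLinear`, `log_delta`, and `bits_per_weight` are hypothetical, and the mapping from a learned step size to a bit count is one plausible choice. During training, each weight is perturbed by uniform noise whose per-parameter magnitude is itself a trainable parameter; at inference, weights are rounded to the grid implied by the learned magnitude, so smaller noise corresponds to higher precision.

```python
# Illustrative sketch only: per-weight quantization modeled as additive uniform
# noise with a trainable magnitude (step size). Hypothetical names, not the
# paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        # One trainable log step size per weight: the noise magnitude that
        # stands in for that weight's (discrete) precision.
        self.log_delta = nn.Parameter(torch.full((out_features, in_features), -4.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.log_delta.exp()
        if self.training:
            # Uniform noise in [-delta/2, delta/2] approximates rounding error,
            # keeping the loss differentiable w.r.t. both weights and deltas.
            noise = (torch.rand_like(self.weight) - 0.5) * delta
            w = self.weight + noise
        else:
            # At inference, round each weight to the grid implied by its
            # learned step size.
            w = torch.round(self.weight / delta) * delta
        return F.linear(x, w)

    def bits_per_weight(self) -> torch.Tensor:
        # A step size delta over a weight range R corresponds to roughly
        # log2(R / delta) bits, so smaller deltas mean higher precision.
        weight_range = self.weight.detach().abs().max()
        return torch.log2(weight_range / self.log_delta.exp()).clamp(min=1.0)
```

In such a setup, a regularizer on the total bit budget (e.g., adding `layer.bits_per_weight().sum()` to the loss, scaled by a trade-off coefficient) would let the optimizer allocate more bits to sensitive weights and fewer to robust ones, yielding the heterogeneous precision assignments the abstract describes; the same noise mechanism can in principle be attached to activations as well.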

Cite

Text

Savarese et al. "Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise." Neural Information Processing Systems, 2022.

Markdown

[Savarese et al. "Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/savarese2022neurips-all/)

BibTeX

@inproceedings{savarese2022neurips-all,
  title     = {{Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise}},
  author    = {Savarese, Pedro and Yuan, Xin and Li, Yanjing and Maire, Michael},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/savarese2022neurips-all/}
}