Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise
Abstract
We study the problem of training deep networks while quantizing parameters and activations into low-precision numeric representations, a setting central to reducing the energy consumption and inference time of deployed models. We propose a method that learns different precisions, as measured by bits in numeric representations, for different weights in a neural network, yielding a heterogeneous allocation of bits across parameters. Precisions are learned alongside weight values, using a strategy derived from a novel framework in which the intractable optimization over discrete precisions is approximated by training per-parameter noise magnitudes. We broaden this framework to also encompass learning precisions for hidden state activations, simultaneously with weight precisions and values. Our approach exposes the objective of constructing a low-precision, inference-efficient model to the entirety of the training process. Experiments show that it finds highly heterogeneous precision assignments for CNNs trained on CIFAR and ImageNet, improving upon previous state-of-the-art quantization methods. Our improvements extend to the challenging scenario of learning reduced-precision GANs.
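The core idea of the abstract is to replace the discrete choice of bit-width per weight with a continuous, trainable noise magnitude that stands in for quantization error. Below is a minimal PyTorch sketch of that idea under our own assumptions; it is not the authors' implementation, and names such as `NoisyLinear`, `log_sigma`, and `bit_penalty` are hypothetical illustrations of how per-parameter noise could be trained and mapped back to an effective bit count.

```python
# Sketch: per-parameter trainable noise as a surrogate for bit precision.
# Assumption: uniform quantization noise with step size 2*sigma, so larger
# noise corresponds to fewer effective bits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer whose weights receive additive uniform noise with a
    trainable, per-parameter magnitude during training."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Log of the per-weight noise half-width; trained jointly with weights.
        self.log_sigma = nn.Parameter(torch.full((out_features, in_features), -5.0))

    def forward(self, x):
        if self.training:
            sigma = self.log_sigma.exp()
            # Uniform noise in [-sigma, sigma] mimics the rounding error of a
            # quantizer with step size 2*sigma.
            noise = (torch.rand_like(self.weight) * 2 - 1) * sigma
            w = self.weight + noise
        else:
            w = self.weight  # at deployment the weights would be truly quantized
        return F.linear(x, w)

    def effective_bits(self):
        # Map each noise magnitude to an approximate bit count, assuming a
        # uniform quantizer over the layer's dynamic range.
        dyn_range = self.weight.detach().abs().max()
        step = 2 * self.log_sigma.exp()
        return torch.log2(dyn_range / step + 1).clamp(min=1)

def bit_penalty(model, coeff=1e-4):
    """Regularizer encouraging low average precision across noisy layers,
    added to the task loss so precision allocation is learned end to end."""
    total = 0.0
    for m in model.modules():
        if isinstance(m, NoisyLinear):
            total = total + m.effective_bits().mean()
    return coeff * total
```

In this sketch, training minimizes the task loss plus `bit_penalty(model)`, so gradients push each `log_sigma` toward larger noise (fewer bits) wherever accuracy permits, yielding a heterogeneous bit allocation across parameters; the actual method and its treatment of activation precisions are detailed in the paper.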
Cite
Text
Savarese et al. "Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise." Neural Information Processing Systems, 2022.
Markdown
[Savarese et al. "Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/savarese2022neurips-all/)
BibTeX
@inproceedings{savarese2022neurips-all,
title = {{Not All Bits Have Equal Value: Heterogeneous Precisions via Trainable Noise}},
author = {Savarese, Pedro and Yuan, Xin and Li, Yanjing and Maire, Michael},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/savarese2022neurips-all/}
}