Probabilistic Weight Fixing: Large-Scale Training of Neural Network Weight Uncertainties for Quantisation

Abstract

Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods often treat weights solely according to their value, neglecting the unique role of weight position. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster center, and to what degree, based on their individual position-specific learned uncertainty distributions. We introduce a new initialization setting and a regularization term, enabling the training of BNNs with complex dataset-model combinations. Leveraging the flexibility of weight values drawn from probability distributions, we enhance noise resilience and compressibility. Our iterative clustering procedure demonstrates superior compressibility and higher accuracy compared to state-of-the-art methods on both ResNet models and the more complex transformer-based architectures. In particular, our method outperforms the state-of-the-art quantization method by 1.6% in top-1 accuracy on ImageNet using DeiT-Tiny, with its 5 million+ weights now represented by only 296 unique values. Code is available at https://github.com/subiawaud/PWFN.
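
The idea of moving a weight to a cluster center only when its learned uncertainty permits can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes each weight has a learned mean `mu` and standard deviation `sigma`, and it snaps a weight to its nearest center only if that center lies within `delta * sigma` of the mean. The function name `assign_to_centers`, the threshold `delta`, and the example values are all hypothetical.

```python
import numpy as np

def assign_to_centers(mu, sigma, centers, delta=1.0):
    """Snap each weight mean to its nearest center if within delta * sigma.

    Illustrative assumption: a weight is "movable" when the distance to its
    nearest center is no larger than delta times its own standard deviation.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    centers = np.asarray(centers, dtype=float)

    # Pairwise distances between weight means and centers: (n_weights, n_centers)
    dist = np.abs(mu[:, None] - centers[None, :])
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(mu.size), nearest]

    # A weight with large uncertainty tolerates a larger move.
    movable = nearest_dist <= delta * sigma
    fixed = np.where(movable, centers[nearest], mu)
    return fixed, movable

# Hypothetical values: four weights, three candidate centers.
mu = np.array([0.24, 0.26, -0.51, 0.02])
sigma = np.array([0.05, 0.001, 0.02, 0.10])
centers = np.array([-0.5, 0.0, 0.25])

fixed, movable = assign_to_centers(mu, sigma, centers)
print(fixed)    # weights after snapping
print(movable)  # which weights were moved to a center
```

In this toy example the second weight has a very small `sigma`, so it stays at its original value even though it is as close to a center as the first weight. That is the position-specific behaviour the abstract describes: two weights with similar values can be treated differently.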

Cite

Text

Subia-Waud and Dasmahapatra. "Probabilistic Weight Fixing: Large-Scale Training of Neural Network Weight Uncertainties for Quantisation." Neural Information Processing Systems, 2023.

Markdown

[Subia-Waud and Dasmahapatra. "Probabilistic Weight Fixing: Large-Scale Training of Neural Network Weight Uncertainties for Quantisation." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/subiawaud2023neurips-probabilistic/)

BibTeX

@inproceedings{subiawaud2023neurips-probabilistic,
  title     = {{Probabilistic Weight Fixing: Large-Scale Training of Neural Network Weight Uncertainties for Quantisation}},
  author    = {Subia-Waud, Chris and Dasmahapatra, Srinandan},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/subiawaud2023neurips-probabilistic/}
}