Scalable Model Compression by Entropy Penalized Reparameterization

Abstract

We describe a simple and general neural network weight compression approach, in which the network parameters (weights and biases) are represented in a “latent” space, amounting to a reparameterization. This space is equipped with a learned probability model, which is used to impose an entropy penalty on the parameter representation during training, and to compress the representation using a simple arithmetic coder after training. Classification accuracy and model compressibility are maximized jointly, with the bitrate–accuracy trade-off specified by a hyperparameter. We evaluate the method on the MNIST, CIFAR-10 and ImageNet classification benchmarks using six distinct model architectures. Our results show that state-of-the-art model compression can be achieved in a scalable and general way without requiring complex procedures such as multi-stage training.
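To make the abstract's joint objective concrete, the following is a minimal sketch of an entropy-penalized loss in NumPy. It assumes a fixed discretized Gaussian as the probability model over quantized latent weights (the paper learns this model jointly with the network); the function name, the `mu`/`sigma` parameters, and the weighting scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from math import erf


def entropy_penalized_loss(task_loss, latents, mu, sigma, lam):
    """Joint objective: task loss plus lam times the estimated code
    length (in bits) of the quantized latent weight representation.

    The probability model here is a fixed discretized Gaussian with
    mean `mu` and scale `sigma` (an assumption for illustration; the
    paper learns the model during training)."""
    # Rounding stands in for the quantization grid the coder operates on.
    q = np.round(latents)

    # Probability mass of each quantized value under the discretized
    # Gaussian: P(q) = CDF(q + 0.5) - CDF(q - 0.5).
    def cdf(x):
        return 0.5 * (1.0 + erf((x - mu) / (sigma * np.sqrt(2.0))))

    p = np.vectorize(cdf)(q + 0.5) - np.vectorize(cdf)(q - 0.5)
    p = np.clip(p, 1e-12, 1.0)

    # Estimated code length in bits, i.e. what an arithmetic coder
    # using this model would approximately spend on the latents.
    bits = -np.sum(np.log2(p))
    return task_loss + lam * bits, bits
```

Here `lam` plays the role of the bitrate–accuracy trade-off hyperparameter from the abstract: larger values push training toward more compressible (lower-entropy) latents at some cost in task loss.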

Cite

Text

Oktay et al. "Scalable Model Compression by Entropy Penalized Reparameterization." International Conference on Learning Representations, 2020.

Markdown

[Oktay et al. "Scalable Model Compression by Entropy Penalized Reparameterization." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/oktay2020iclr-scalable/)

BibTeX

@inproceedings{oktay2020iclr-scalable,
  title     = {{Scalable Model Compression by Entropy Penalized Reparameterization}},
  author    = {Oktay, Deniz and Ballé, Johannes and Singh, Saurabh and Shrivastava, Abhinav},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/oktay2020iclr-scalable/}
}