Differentiable Soft Min-Max Loss to Restrict Weight Range for Model Quantization
Abstract
A wide range of weights in a model hinders effective lower-bit quantization. Penalizing the weight range improves quantization accuracy, but the range (max − min) is not differentiable. In this work, we propose the Differentiable Soft Min-Max (DSMM) loss to restrict weight ranges, yielding a quantization-friendly model with narrow weight ranges. DSMM uses a learnable parameter that adjusts its hardness without requiring a special hyper-parameter. DSMM improves lower-bit quantization accuracy with state-of-the-art post-training quantization (PTQ), quantization-aware training (QAT), and weight clustering across various domains and model sizes.
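The soft min-max idea can be sketched with a log-sum-exp relaxation of max and min, which is differentiable everywhere. The sketch below is a generic illustration with a fixed temperature-like hardness parameter `beta` (the paper instead learns the hardness), not the authors' exact formulation.

```python
import numpy as np

def soft_max(w, beta):
    """Smooth upper bound on max(w) via log-sum-exp; tightens as beta grows."""
    m = np.max(w)  # shift by the maximum for numerical stability
    return m + np.log(np.sum(np.exp(beta * (w - m)))) / beta

def soft_min(w, beta):
    """Smooth lower bound on min(w), defined by symmetry: -soft_max(-w)."""
    return -soft_max(-w, beta)

def soft_range_loss(w, beta=10.0):
    """Differentiable surrogate for the weight range max(w) - min(w)."""
    return soft_max(w, beta) - soft_min(w, beta)

w = np.array([-1.0, 0.5, 2.0])
print(soft_range_loss(w, beta=100.0))  # close to the true range 3.0
```

The surrogate never underestimates the true range, and increasing `beta` makes it harder (tighter); adding it to the task loss penalizes wide per-tensor weight ranges during training.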
Cite
Text
Kundu et al. "Differentiable Soft Min-Max Loss to Restrict Weight Range for Model Quantization." ICML 2024 Workshops: Differentiable_Almost_Everything, 2024.
Markdown
[Kundu et al. "Differentiable Soft Min-Max Loss to Restrict Weight Range for Model Quantization." ICML 2024 Workshops: Differentiable_Almost_Everything, 2024.](https://mlanthology.org/icmlw/2024/kundu2024icmlw-differentiable/)
BibTeX
@inproceedings{kundu2024icmlw-differentiable,
title = {{Differentiable Soft Min-Max Loss to Restrict Weight Range for Model Quantization}},
author = {Kundu, Arnav and Yoo, Chungkuk and Cho, Minsik and Adya, Saurabh},
booktitle = {ICML 2024 Workshops: Differentiable_Almost_Everything},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/kundu2024icmlw-differentiable/}
}