Benchmarking the Reliability of Post-Training Quantization: A Particular Focus on Worst-Case Performance

Abstract

The reliability of post-training quantization (PTQ) under extreme conditions such as distribution shift and data noise remains largely unexplored, despite the popularity of PTQ as a method for compressing deep neural networks (DNNs) without altering their original architecture or training procedures. This paper investigates commonly used PTQ methods, addressing research questions on how variations in the calibration set distribution, the choice of calibration paradigm, and data augmentation or sampling strategies affect the reliability of PTQ. A systematic evaluation across multiple tasks and commonly used PTQ paradigms shows that most existing PTQ methods are not reliable enough in terms of worst-case group performance, underscoring the need for more robust approaches.
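For readers unfamiliar with calibration-based PTQ, the sketch below is a minimal, hypothetical illustration (not code from the paper): symmetric min-max calibration in NumPy, showing why a mismatch between the calibration distribution and the deployment distribution can degrade quantization quality. All names and the toy data are assumptions for illustration only.

```python
import numpy as np

def calibrate_scale(calibration_batch: np.ndarray, num_bits: int = 8) -> float:
    """Symmetric min-max calibration: choose a scale so the observed
    range of the calibration data maps onto the signed integer grid."""
    qmax = 2 ** (num_bits - 1) - 1                  # e.g. 127 for int8
    max_abs = float(np.abs(calibration_batch).max())
    return max_abs / qmax if max_abs > 0 else 1.0

def fake_quantize(x: np.ndarray, scale: float, num_bits: int = 8) -> np.ndarray:
    """Round to the integer grid defined by `scale`, then map back to floats."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x / scale), qmin, qmax) * scale

# Calibrate on one distribution, then reuse the same scale on shifted data:
# out-of-range values get clipped, inflating the quantization error. This is
# the kind of worst-case degradation the benchmark is concerned with.
rng = np.random.default_rng(0)
calib = rng.normal(0.0, 1.0, size=4096)
shifted = calib * 1.5 + rng.normal(0.0, 0.3, size=4096)  # scaled + noisy "deployment" data

scale = calibrate_scale(calib)
mse_calib = np.mean((fake_quantize(calib, scale) - calib) ** 2)
mse_shift = np.mean((fake_quantize(shifted, scale) - shifted) ** 2)
print(f"quantization MSE on calibration data:     {mse_calib:.6f}")
print(f"quantization MSE under distribution shift: {mse_shift:.6f}")
```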

Cite

Text

Yuan et al. "Benchmarking the Reliability of Post-Training Quantization: A Particular Focus on Worst-Case Performance." ICML 2023 Workshops: AdvML-Frontiers, 2023.

Markdown

[Yuan et al. "Benchmarking the Reliability of Post-Training Quantization: A Particular Focus on Worst-Case Performance." ICML 2023 Workshops: AdvML-Frontiers, 2023.](https://mlanthology.org/icmlw/2023/yuan2023icmlw-benchmarking/)

BibTeX

@inproceedings{yuan2023icmlw-benchmarking,
  title     = {{Benchmarking the Reliability of Post-Training Quantization: A Particular Focus on Worst-Case Performance}},
  author    = {Yuan, Zhihang and Liu, Jiawei and Wu, Jiaxiang and Yang, Dawei and Wu, Qiang and Sun, Guangyu and Liu, Wenyu and Wang, Xinggang and Wu, Bingzhe},
  booktitle = {ICML 2023 Workshops: AdvML-Frontiers},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/yuan2023icmlw-benchmarking/}
}