Weight-Rounding Error in Deep Neural Networks

Abstract

Current AI technologies based on deep neural networks (DNNs) are computationally extremely demanding, which limits their widespread deployment in embedded devices with constrained energy resources (e.g. battery-powered smartphones). One possible approach to solving this problem is to reduce the precision of weight parameters, which can save an enormous amount of energy for computation and data transfer at the cost of only a small loss in inference accuracy. In this paper, we provide a theoretical analysis of the effect of any weight rounding (e.g. to a reduced bitwidth) in a trained DNN on its output. We first derive a global upper bound on the output error of a DNN (under the $L_1$ norm) caused by the weight rounding, taken over all inputs from a bounded domain in the worst case; this bound turns out to be too loose for practical use. We prove that computing this maximum error for a given weight rounding is NP-hard even for two-layer networks, which follows from the NP-hardness of computing neuron state domains. Based on the concept of so-called shortcut weights, we propose a method called AppMax that estimates this error using linear programming on convex polytopes around test/training data points and works for any approximation of a DNN (e.g. including pruning). The AppMax method was extensively tested on fully connected and convolutional neural networks (trained on the MNIST database) with decreasing weight bitwidths. The experiments demonstrate a clear improvement in the error guarantees provided by this method, which can be used to evaluate different approximation strategies and identify those that best balance accuracy and energy efficiency.
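To make the setting concrete, the sketch below rounds a trained network's weights to a given bitwidth and measures the resulting $L_1$ output error on sampled inputs. This is an illustrative uniform rounding scheme and a hypothetical tiny two-layer ReLU network, not the paper's specific quantizer or the AppMax method; all names are placeholders.

```python
import numpy as np

def round_weights(W, bits):
    """Round weights to a signed uniform grid with the given bitwidth.

    Illustrative scheme (an assumption, not the paper's): the range
    [-max|W|, max|W|] is covered by 2**(bits-1) - 1 positive levels and
    each weight is snapped to the nearest grid point.
    """
    scale = np.max(np.abs(W))
    if scale == 0:
        return W.copy()
    levels = 2 ** (bits - 1) - 1          # e.g. 127 levels for 8 bits
    return np.round(W / scale * levels) / levels * scale

def forward(x, W1, W2):
    # One hidden ReLU layer: a minimal stand-in for a trained DNN.
    return np.maximum(x @ W1, 0) @ W2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(8, 4))
x = rng.normal(size=(100, 16))            # sampled inputs from a bounded domain

for bits in (8, 4, 2):
    y = forward(x, W1, W2)
    yq = forward(x, round_weights(W1, bits), round_weights(W2, bits))
    err = np.abs(y - yq).sum(axis=1).max()  # worst observed L1 output error
    print(f"{bits}-bit weights: max L1 output error over samples = {err:.4f}")
```

Such sampling only observes the error at tested points; the paper's contribution is bounding the error over whole input regions, which is exactly what makes the problem NP-hard.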

Cite

Text

Šíma and Vidnerová. "Weight-Rounding Error in Deep Neural Networks." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06078-5_23

Markdown

[Šíma and Vidnerová. "Weight-Rounding Error in Deep Neural Networks." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/sima2025ecmlpkdd-weightrounding/) doi:10.1007/978-3-032-06078-5_23

BibTeX

@inproceedings{sima2025ecmlpkdd-weightrounding,
  title     = {{Weight-Rounding Error in Deep Neural Networks}},
  author    = {Šíma, Jiří and Vidnerová, Petra},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2025},
  pages     = {398-416},
  doi       = {10.1007/978-3-032-06078-5_23},
  url       = {https://mlanthology.org/ecmlpkdd/2025/sima2025ecmlpkdd-weightrounding/}
}