NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
Abstract
Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains problematically high. An effective strategy for reducing such consumption is supply-voltage reduction, but if done too aggressively, it can lead to accuracy degradation. This is due to random bit-flips in static random access memory (SRAM), where model parameters are stored. To address this challenge, we have developed NeuralFuse, a novel add-on module that handles the energy-accuracy tradeoff in low-voltage regimes by learning input transformations and using them to generate error-resistant data representations, thereby protecting DNN accuracy in both nominal and low-voltage scenarios. As well as being easy to implement, NeuralFuse can be readily applied to DNNs with limited access, such as cloud-based APIs that are accessed remotely or non-configurable hardware. Our experimental results demonstrate that, at a 1% bit-error rate, NeuralFuse can reduce SRAM access energy by up to 24% while recovering accuracy by up to 57%. To the best of our knowledge, this is the first approach to addressing low-voltage-induced bit errors that requires no model retraining.
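To make the failure mode concrete, the following is a minimal sketch of how random bit-flips at a given bit-error rate might be simulated on stored model parameters. It is not the authors' implementation: the 8-bit weight representation, the independent per-bit flip model, and the name `flip_bits` are all assumptions made for illustration.

```python
import numpy as np


def flip_bits(weights_q: np.ndarray, bit_error_rate: float,
              rng: np.random.Generator) -> np.ndarray:
    """Flip each stored bit independently with probability `bit_error_rate`.

    Assumes `weights_q` holds 8-bit quantized weights (int8 or uint8);
    this is an illustrative fault model, not the one from the paper.
    """
    raw = np.ascontiguousarray(weights_q).view(np.uint8)
    # One Bernoulli draw per bit: trailing axis of length 8 per byte.
    flips = rng.random(raw.shape + (8,)) < bit_error_rate
    # Pack the 8 per-bit flags into a single XOR mask per byte.
    mask = np.packbits(flips, axis=-1).reshape(raw.shape)
    return (raw ^ mask).view(weights_q.dtype)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.integers(-128, 128, size=(1000, 100), dtype=np.int8)
    noisy = flip_bits(w, 0.01, rng)  # 1% bit-error rate, as in the abstract
    diff = w.view(np.uint8) ^ noisy.view(np.uint8)
    print("observed bit-error rate:", np.unpackbits(diff).mean())
```

Under this model, a perturbed copy of the weights could be used to evaluate how much a fixed model's accuracy drops, and whether a learned input transformation recovers it, without retraining the model itself.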
Cite
Text
Sun et al. "NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes." Neural Information Processing Systems, 2024. doi:10.52202/079017-1729
Markdown
[Sun et al. "NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/sun2024neurips-neuralfuse/) doi:10.52202/079017-1729
BibTeX
@inproceedings{sun2024neurips-neuralfuse,
title = {{NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes}},
author = {Sun, Hao-Lun and Hsiung, Lei and Chandramoorthy, Nandhini and Chen, Pin-Yu and Ho, Tsung-Yi},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-1729},
url = {https://mlanthology.org/neurips/2024/sun2024neurips-neuralfuse/}
}