Hardware-Efficient Quantization for Green Custom Foundation Models
Abstract
We propose a new hardware-efficient quantization (HEQ) method for low-power, full-custom foundation models. HEQ jointly optimizes the multiplier hardware and the weight quantization to minimize total power consumption. By exploiting the power profile of custom multipliers, our method achieves a significant power reduction of up to 20-fold.
Cite
Text
Koike-Akino et al. "Hardware-Efficient Quantization for Green Custom Foundation Models." ICML 2024 Workshops: ES-FoMo-II, 2024.
Markdown
[Koike-Akino et al. "Hardware-Efficient Quantization for Green Custom Foundation Models." ICML 2024 Workshops: ES-FoMo-II, 2024.](https://mlanthology.org/icmlw/2024/koikeakino2024icmlw-hardwareefficient/)
BibTeX
@inproceedings{koikeakino2024icmlw-hardwareefficient,
title = {{Hardware-Efficient Quantization for Green Custom Foundation Models}},
author = {Koike-Akino, Toshiaki and Meng, Chang and Cevher, Volkan and De Micheli, Giovanni},
booktitle = {ICML 2024 Workshops: ES-FoMo-II},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/koikeakino2024icmlw-hardwareefficient/}
}