Hardware-Efficient Quantization for Green Custom Foundation Models

Abstract

We propose a new hardware-efficient quantization (HEQ) method for low-power, full-custom foundation models. HEQ jointly optimizes the multiplier hardware and the weight quantization to minimize total power consumption. By exploiting the power profile of custom multipliers, our method achieves a power reduction of up to 20-fold.
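
To make the idea of power-aware quantization concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical per-level power cost `toy_power` (the number of set bits in the integer weight, standing in for a measured multiplier power profile) and a hypothetical trade-off parameter `lam`. Each weight is mapped to the integer level that minimizes squared quantization error plus `lam` times the assumed power cost. The abstract does not specify the actual objective or how the multiplier hardware is co-optimized, so neither is modeled here.

```python
def toy_power(level: int) -> float:
    """Hypothetical power cost of multiplying by `level`.

    Assumption for illustration only: cost equals the number of set bits
    in |level|. A real flow would use a measured multiplier power profile.
    """
    return float(bin(abs(level)).count("1"))


def power_aware_quantize(weights, scale, bits=4, lam=0.0):
    """Map each weight to a signed integer level in [-qmax, qmax].

    The chosen level minimizes (w - q * scale)^2 + lam * toy_power(q).
    With lam = 0 this reduces to nearest-level rounding.
    """
    qmax = 2 ** (bits - 1) - 1
    levels = range(-qmax, qmax + 1)
    quantized = []
    for w in weights:
        best = min(
            levels,
            key=lambda q: (w - q * scale) ** 2 + lam * toy_power(q),
        )
        quantized.append(best)
    return quantized


def total_power(levels) -> float:
    """Sum of the hypothetical per-level power costs."""
    return sum(toy_power(q) for q in levels)


if __name__ == "__main__":
    weights = [0.7, -0.3, 0.44]
    for lam in (0.0, 0.02):
        q = power_aware_quantize(weights, scale=0.1, bits=4, lam=lam)
        print(f"lam={lam}: levels={q}, toy power={total_power(q)}")
```

Raising `lam` trades quantization accuracy for lower assumed power. In this toy model the penalized solution never uses more power than plain rounding, because any level that is both less accurate and more costly than the nearest level cannot win the minimization.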

Cite

Text

Koike-Akino et al. "Hardware-Efficient Quantization for Green Custom Foundation Models." ICML 2024 Workshops: ES-FoMo-II, 2024.

Markdown

[Koike-Akino et al. "Hardware-Efficient Quantization for Green Custom Foundation Models." ICML 2024 Workshops: ES-FoMo-II, 2024.](https://mlanthology.org/icmlw/2024/koikeakino2024icmlw-hardwareefficient/)

BibTeX

@inproceedings{koikeakino2024icmlw-hardwareefficient,
  title     = {{Hardware-Efficient Quantization for Green Custom Foundation Models}},
  author    = {Koike-Akino, Toshiaki and Meng, Chang and Cevher, Volkan and De Micheli, Giovanni},
  booktitle = {ICML 2024 Workshops: ES-FoMo-II},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/koikeakino2024icmlw-hardwareefficient/}
}