HADS: Hardware-Aware Deep Subnetworks

Abstract

We propose Hardware-Aware Deep Subnetworks (HADS) to tackle model adaptation to dynamic resource constraints. In contrast to the state of the art, HADS use structured sparsity constructively by exploiting the permutation invariance of neurons, which allows for hardware-specific optimizations. HADS achieve computational efficiency by skipping sequential computational blocks identified by a novel iterative knapsack optimizer. HADS support conventional deep networks frequently deployed on low-resource edge devices and provide computational benefits even for small and simple networks. We evaluate HADS on six benchmark architectures trained on the Google Speech Commands, Fashion-MNIST and CIFAR10 datasets, and test them on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence for HADS' outstanding performance in terms of the submodels' test-set accuracy, and demonstrate an adaptation time in response to dynamic resource constraints of under 40$\mu$s for a 2-layer fully-connected network on an Arduino Nano 33 BLE Sense.
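
The sketch below (not the authors' implementation) illustrates the core idea the abstract relies on: because hidden neurons can be permuted without changing a network's function, they can be reordered by importance once, offline, so that nested submodels are obtained at runtime by simply slicing off the first k neurons. The importance score (L1 weight norm) and the width schedule are illustrative placeholders for the paper's iterative knapsack optimizer; the 2-layer fully-connected network mirrors the one mentioned for the Arduino deployment.

import torch
import torch.nn as nn
import torch.nn.functional as F


class SlimmableMLP(nn.Module):
    """Two-layer MLP whose hidden layer can be truncated at inference time."""

    def __init__(self, in_dim=784, hidden=128, out_dim=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)

    @torch.no_grad()
    def reorder_hidden_neurons(self):
        """Permute hidden neurons by importance; the full model stays unchanged."""
        # Illustrative importance score: L1 norm of each neuron's in/out weights.
        score = self.fc1.weight.abs().sum(dim=1) + self.fc2.weight.abs().sum(dim=0)
        order = torch.argsort(score, descending=True)
        self.fc1.weight.copy_(self.fc1.weight[order])
        self.fc1.bias.copy_(self.fc1.bias[order])
        self.fc2.weight.copy_(self.fc2.weight[:, order])

    def forward(self, x, width=1.0):
        """Keep only the first `width` fraction of hidden neurons."""
        k = max(1, int(width * self.fc1.out_features))
        h = F.relu(F.linear(x, self.fc1.weight[:k], self.fc1.bias[:k]))
        return F.linear(h, self.fc2.weight[:, :k], self.fc2.bias)


model = SlimmableMLP()
model.reorder_hidden_neurons()   # done once, offline
x = torch.randn(1, 784)
y_full = model(x, width=1.0)     # full model
y_small = model(x, width=0.25)   # submodel under a tighter resource budget

Switching between submodels only changes the slice index k, which is why adaptation can be on the order of microseconds on a microcontroller; selecting which blocks each submodel keeps under a given resource budget is what the paper's knapsack optimizer decides.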

Cite

Text

Corti et al. "HADS: Hardware-Aware Deep Subnetworks." ICLR 2024 Workshops: PML4LRS, 2024.

Markdown

[Corti et al. "HADS: Hardware-Aware Deep Subnetworks." ICLR 2024 Workshops: PML4LRS, 2024.](https://mlanthology.org/iclrw/2024/corti2024iclrw-hads/)

BibTeX

@inproceedings{corti2024iclrw-hads,
  title     = {{HADS: Hardware-Aware Deep Subnetworks}},
  author    = {Corti, Francesco and Maag, Balz and Schauer, Joachim and Pferschy, Ulrich and Saukh, Olga},
  booktitle = {ICLR 2024 Workshops: PML4LRS},
  year      = {2024},
  url       = {https://mlanthology.org/iclrw/2024/corti2024iclrw-hads/}
}