Activity Sparsity Complements Weight Sparsity for Efficient RNN Inference

Abstract

Artificial neural networks open up unprecedented machine learning capabilities at the cost of seemingly ever-growing computational requirements. Concurrently, the field of neuromorphic computing develops biologically inspired spiking neural networks and hardware platforms with the goal of bridging the efficiency gap between biological brains and deep learning systems. Yet, spiking neural networks often fall behind deep learning systems on many machine learning tasks. In this work, we demonstrate that the operation-count reduction from sparsely activated recurrent neural networks multiplies with the reduction from sparse weights. Our model achieves up to a $20\times$ reduction in operations while maintaining perplexities below $60$ on the Penn Treebank language modeling task. This reduction factor has not been achieved with sparsely connected LSTMs alone, and the language modeling performance of our model has not been achieved with sparsely activated spiking neural networks. Our results suggest further driving the convergence of methods from deep learning and neuromorphic computing for efficient machine learning.
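
The multiplicative claim can be illustrated with a minimal counting sketch (not the authors' code): in a recurrent matrix-vector product, weight sparsity removes pruned entries and activity sparsity lets entire columns belonging to inactive units be skipped, so the surviving multiply-accumulate (MAC) operations scale roughly with the product of the two densities. The layer size and density values below are illustrative assumptions.

```python
# Sketch: counting MACs for one recurrent matrix-vector product when both
# weights and activations are sparse. Densities and size are assumptions.
import numpy as np

rng = np.random.default_rng(0)

n = 1024                     # hidden units (assumed)
weight_density = 0.2         # fraction of weights kept after pruning (assumed)
activity_density = 0.25      # fraction of units active per step (assumed)

# Random pruning mask and random active-unit mask, for illustration only.
W_mask = rng.random((n, n)) < weight_density
active = rng.random(n) < activity_density

# Dense baseline: every weight-activation pair costs one MAC.
dense_macs = n * n

# Activity sparsity skips columns of inactive units; weight sparsity skips
# pruned entries within the remaining columns.
sparse_macs = W_mask[:, active].sum()

print(f"dense MACs:  {dense_macs}")
print(f"sparse MACs: {sparse_macs}")
print(f"reduction:   {dense_macs / sparse_macs:.1f}x "
      f"(~{1/weight_density:.0f}x weights * {1/activity_density:.0f}x activity)")
```

With the assumed densities the expected reduction is about $1/(0.2 \times 0.25) = 20\times$, matching the order of magnitude quoted in the abstract; the sketch is purely a counting argument and does not reproduce the paper's experiments.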

Cite

Text

Mukherji et al. "Activity Sparsity Complements Weight Sparsity for Efficient RNN Inference." NeurIPS 2023 Workshops: MLNCP, 2023.

Markdown

[Mukherji et al. "Activity Sparsity Complements Weight Sparsity for Efficient RNN Inference." NeurIPS 2023 Workshops: MLNCP, 2023.](https://mlanthology.org/neuripsw/2023/mukherji2023neuripsw-activity/)

BibTeX

@inproceedings{mukherji2023neuripsw-activity,
  title     = {{Activity Sparsity Complements Weight Sparsity for Efficient RNN Inference}},
  author    = {Mukherji, Rishav and Schöne, Mark and Nazeer, Khaleelulla Khan and Mayr, Christian and Subramoney, Anand},
  booktitle = {NeurIPS 2023 Workshops: MLNCP},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/mukherji2023neuripsw-activity/}
}