A Diagonal State Space Model on Loihi 2 for Efficient Streaming Sequence Processing

Abstract

Deep State Space Models (SSMs) demonstrate state-of-the-art performance on long-range sequence modeling tasks. While the recurrent structure of SSMs can be efficiently implemented as a convolution or as a parallel scan during training, recurrent token-by-token processing cannot currently be implemented efficiently on GPUs. Here, we demonstrate efficient token-by-token inference of the SSM S4D on Intel's state-of-the-art neuromorphic processor, Loihi 2. We compare this first-ever neuromorphic-hardware implementation of an SSM on sMNIST, psMNIST, and sCIFAR to a recurrent and a convolutional implementation of S4D on Jetson Orin Nano (Jetson). While we find Jetson to perform better in an offline, sample-by-sample batched processing mode, Loihi 2 outperforms during token-by-token processing, where it consumes 1000 times less energy with a 75 times lower latency and a 75 times higher throughput compared to the recurrent implementation of S4D on Jetson. This opens up new avenues towards efficient real-time streaming applications of SSMs.
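The equivalence between the two modes the abstract contrasts, a streaming recurrence for inference and a convolution for training, can be sketched for a diagonal SSM. The sketch below is illustrative only: the state size, the random complex parameters `A_bar`, `B_bar`, `C`, and the scalar input are assumptions for demonstration, not the paper's trained S4D model.

```python
import numpy as np

# Toy diagonal SSM (S4D-style): x_k = A_bar * x_{k-1} + B_bar * u_k,
# y_k = Re(C @ x_k). Parameters here are random stand-ins, not trained values.
rng = np.random.default_rng(0)
N = 4   # state dimension (assumed)
L = 8   # sequence length (assumed)

# Stable discretized diagonal state matrix (|A_bar| < 1), complex-valued.
A_bar = 0.9 * np.exp(1j * rng.uniform(0, np.pi, N))
B_bar = rng.standard_normal(N) + 1j * rng.standard_normal(N)
C = rng.standard_normal(N) + 1j * rng.standard_normal(N)

u = rng.standard_normal(L)  # real-valued input token stream

# Streaming (token-by-token) mode: O(N) elementwise update per token,
# since the diagonal A_bar makes the matrix-vector product elementwise.
x = np.zeros(N, dtype=complex)
y_rec = np.empty(L)
for k in range(L):
    x = A_bar * x + B_bar * u[k]
    y_rec[k] = (C @ x).real

# Convolutional mode (as used during training): the impulse response
# K[m] = Re(C @ (A_bar**m * B_bar)) turns the recurrence into a 1-D convolution.
K = np.stack([(C * (A_bar ** m) * B_bar).sum().real for m in range(L)])
y_conv = np.convolve(u, K)[:L]

assert np.allclose(y_rec, y_conv)  # both modes produce the same output
```

The streaming loop is the mode mapped onto Loihi 2's per-timestep neuron updates, while GPUs favor the batched convolutional form; the assertion confirms both compute the same sequence.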

Cite

Text

Meyer et al. "A Diagonal State Space Model on Loihi 2 for Efficient Streaming Sequence Processing." NeurIPS 2024 Workshops: MLNCP, 2024.

Markdown

[Meyer et al. "A Diagonal State Space Model on Loihi 2 for Efficient Streaming Sequence Processing." NeurIPS 2024 Workshops: MLNCP, 2024.](https://mlanthology.org/neuripsw/2024/meyer2024neuripsw-diagonal/)

BibTeX

@inproceedings{meyer2024neuripsw-diagonal,
  title     = {{A Diagonal State Space Model on Loihi 2 for Efficient Streaming Sequence Processing}},
  author    = {Meyer, Svea Marie and Weidel, Philipp and Plank, Philipp and Campos-Macias, Leobardo and Shrestha, Sumit Bam and Stratmann, Philipp and Timcheck, Jonathan and Richter, Mathis},
  booktitle = {NeurIPS 2024 Workshops: MLNCP},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/meyer2024neuripsw-diagonal/}
}