Spike2Former: Efficient Spiking Transformer for High-Performance Image Segmentation

Abstract

Spiking Neural Networks (SNNs) offer a low-power advantage but perform poorly on image segmentation tasks. This is because directly converting the complex architectures designed for segmentation into spiking versions leads to performance degradation and non-convergence. To address this challenge, we first identify the modules in the architecture design that cause a severe reduction in spike firing, make targeted improvements, and propose the Spike2Former architecture. Second, we propose normalized integer spiking neurons to solve the training stability problem of SNNs with complex architectures. We set a new state of the art for SNNs on several semantic segmentation datasets, with significant improvements of +12.7% mIoU and 5.0x efficiency on ADE20K, +14.3% mIoU and 5.2x efficiency on VOC2012, and +9.1% mIoU and 6.6x efficiency on Cityscapes.
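To make the idea of a "normalized integer spiking neuron" concrete, here is a minimal PyTorch sketch. The paper's exact formulation is not reproduced in this abstract, so the class name, the maximum firing level `D`, and the straight-through gradient trick below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a normalized-integer spiking activation.
# Assumptions: integer spike counts clipped to [0, D], normalized to [0, 1]
# during training, with a straight-through estimator for the rounding step.
import torch
import torch.nn as nn


class NormalizedIntegerSpike(nn.Module):
    def __init__(self, v_threshold: float = 1.0, max_level: int = 4):
        super().__init__()
        self.v_threshold = v_threshold
        self.max_level = max_level  # D: maximum integer spike count per step

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Integer firing: quantize the membrane potential to {0, 1, ..., D}.
        spikes = torch.clamp(torch.round(x / self.v_threshold), 0, self.max_level)
        # Straight-through estimator: forward uses the integer spikes,
        # backward passes the gradient through as identity.
        spikes = x + (spikes - x).detach()
        # Normalize to [0, 1] to keep activations well-scaled in deep blocks.
        return spikes / self.max_level
```

A normalization like this keeps the activation range comparable to that of standard (non-spiking) transformer blocks, which is one plausible way the training-stability issue described in the abstract could be addressed.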

Cite

Text

Lei et al. "Spike2Former: Efficient Spiking Transformer for High-Performance Image Segmentation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I2.32126

Markdown

[Lei et al. "Spike2Former: Efficient Spiking Transformer for High-Performance Image Segmentation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/lei2025aaai-spike/) doi:10.1609/AAAI.V39I2.32126

BibTeX

@inproceedings{lei2025aaai-spike,
  title     = {{Spike2Former: Efficient Spiking Transformer for High-Performance Image Segmentation}},
  author    = {Lei, Zhenxin and Yao, Man and Hu, Jiakui and Luo, Xinhao and Lu, Yanye and Xu, Bo and Li, Guoqi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {1364--1372},
  doi       = {10.1609/AAAI.V39I2.32126},
  url       = {https://mlanthology.org/aaai/2025/lei2025aaai-spike/}
}