Hybrid Spiking Vision Transformer for Object Detection with Event Cameras
Abstract
Event-based object detection has attracted increasing attention for its high temporal resolution, wide dynamic range, and asynchronous address-event representation. Leveraging these advantages, spiking neural networks (SNNs) have emerged as a promising approach, offering low energy consumption and rich spatiotemporal dynamics. To further enhance the performance of event-based object detection, this study proposes a novel hybrid spiking vision Transformer (HsVT) model. The HsVT model integrates a spatial feature extraction module to capture local and global features, and a temporal feature extraction module to model time dependencies and long-term patterns in event sequences. This combination enables HsVT to capture spatiotemporal features, improving its capability in handling complex event-based object detection tasks. To support research in this area, we developed the Fall DVS detection dataset as a benchmark for event-based object detection tasks. Thanks to its event-based representation, the dataset protects facial privacy and reduces memory usage. Experimental results demonstrate that HsVT outperforms existing SNN methods and achieves competitive performance compared to ANN-based models, with fewer parameters and lower energy consumption.
Cite
Text
Xu et al. "Hybrid Spiking Vision Transformer for Object Detection with Event Cameras." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Xu et al. "Hybrid Spiking Vision Transformer for Object Detection with Event Cameras." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/xu2025icml-hybrid/)
BibTeX
@inproceedings{xu2025icml-hybrid,
title = {{Hybrid Spiking Vision Transformer for Object Detection with Event Cameras}},
author = {Xu, Qi and Deng, Jie and Shen, Jiangrong and Chen, Biwu and Tang, Huajin and Pan, Gang},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {69147--69159},
volume = {267},
url = {https://mlanthology.org/icml/2025/xu2025icml-hybrid/}
}