Continual Multi-View Clustering with Consistent Anchor Guidance
Abstract
The combination of Spiking Neural Networks (SNNs) with Vision Transformer architectures has attracted significant attention due to their great potential for energy-efficient and high-performance computing paradigms. However, a substantial performance gap still exists between SNN-based and ANN-based transformer architectures. While existing methods propose spiking self-attention mechanisms that combine successfully with SNNs, the overall architectures built on these mechanisms suffer from a bottleneck in effectively extracting features at different image scales. In this paper, we address this issue and propose MSVIT, a novel spike-driven Transformer architecture that is the first to use multi-scale spiking attention (MSSA) to enrich the capability of spiking attention blocks. We validate our approach on various mainstream datasets. The experimental results indicate that MSVIT outperforms existing SNN-based models, positioning itself as a state-of-the-art solution among SNN-transformer architectures. The code is available at https://github.com/Nanhu-AI-Lab/MSViT.
Cite
Text
Zhang et al. "Continual Multi-View Clustering with Consistent Anchor Guidance." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/601
Markdown
[Zhang et al. "Continual Multi-View Clustering with Consistent Anchor Guidance." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/zhang2024ijcai-continual/) doi:10.24963/ijcai.2024/601
BibTeX
@inproceedings{zhang2024ijcai-continual,
title = {{Continual Multi-View Clustering with Consistent Anchor Guidance}},
author = {Zhang, Chao and Xu, Deng and Jia, Xiuyi and Chen, Chunlin and Li, Huaxiong},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {5434--5442},
doi = {10.24963/ijcai.2024/601},
url = {https://mlanthology.org/ijcai/2024/zhang2024ijcai-continual/}
}