Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion
Abstract
Mamba and Vision Mamba (Vim) models have shown their potential as an alternative to Transformer-based methods. This work introduces Fast Mamba for Vision (Famba-V), a cross-layer token fusion technique that enhances the training efficiency of Vim models. The key idea of Famba-V is to identify and fuse similar tokens across different Vim layers based on a suite of cross-layer strategies, instead of applying token fusion uniformly across all layers as existing works propose. We evaluate the performance of Famba-V on CIFAR-100. Our results show that Famba-V enhances the training efficiency of Vim models by reducing both training time and peak memory usage during training. Moreover, the proposed cross-layer strategies allow Famba-V to deliver superior accuracy-efficiency trade-offs. Together, these results demonstrate that Famba-V is a promising efficiency enhancement technique for Vim models.
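The core mechanism described above can be sketched in a few lines: measure token similarity, average the most similar pairs to shrink the sequence, and apply this fusion only in a chosen subset of layers. The sketch below is a minimal illustration under our own assumptions, not the paper's implementation; the function names (`fuse_tokens`, `famba_v_upper`), the even/odd pairing, and the `start_layer` threshold for the "upper layers" strategy are all illustrative choices.

```python
import numpy as np

def fuse_tokens(tokens, r):
    """Merge the r most similar (even, odd) token pairs by averaging.

    tokens: (n, d) array of token embeddings; returns a shorter sequence.
    This is a toy similarity-based fusion, not the paper's exact algorithm.
    """
    src, dst = tokens[::2], tokens[1::2]
    # Cosine similarity between every source token and every destination token.
    src_n = src / np.linalg.norm(src, axis=-1, keepdims=True)
    dst_n = dst / np.linalg.norm(dst, axis=-1, keepdims=True)
    sim = src_n @ dst_n.T
    best_dst = sim.argmax(axis=1)          # most similar partner per source token
    best_sim = sim.max(axis=1)
    merge_idx = np.argsort(-best_sim)[:r]  # the r most redundant source tokens
    keep_idx = np.setdiff1d(np.arange(len(src)), merge_idx)
    dst = dst.copy()
    for i in merge_idx:                    # fuse each merged token into its partner
        j = best_dst[i]
        dst[j] = (dst[j] + src[i]) / 2
    return np.concatenate([dst, src[keep_idx]], axis=0)

def famba_v_upper(layers, x, r, start_layer):
    """One hypothetical cross-layer strategy: fuse tokens only in upper layers,
    leaving early layers untouched, rather than fusing uniformly everywhere."""
    for li, layer in enumerate(layers):
        x = layer(x)
        if li >= start_layer:
            x = fuse_tokens(x, r)
    return x
```

Each fusion call removes `r` tokens, so restricting fusion to upper layers trades a smaller efficiency gain for less information loss in early layers, which is the accuracy-efficiency trade-off the cross-layer strategies are designed to navigate.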
Cite
Text
Shen et al. "Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91979-4_20
Markdown
[Shen et al. "Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/shen2024eccvw-fambav/) doi:10.1007/978-3-031-91979-4_20
BibTeX
@inproceedings{shen2024eccvw-fambav,
title = {{Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion}},
author = {Shen, Hui and Wan, Zhongwei and Wang, Xin and Zhang, Mi},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {268--278},
doi = {10.1007/978-3-031-91979-4_20},
url = {https://mlanthology.org/eccvw/2024/shen2024eccvw-fambav/}
}