Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
Abstract
Lung cancer is a leading cause of cancer-related deaths globally. PET-CT is crucial for imaging lung tumors, providing essential metabolic and anatomical information, but it faces challenges such as poor image quality, motion artifacts, and complex tumor morphology. Deep learning-based models are expected to address these problems; however, existing datasets are small-scale and private, which limits performance improvements for these methods. Hence, we introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. Furthermore, we propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images. Specifically, we design a channel-wise rectification module (CRM) that applies a channel state space block across multi-modal features to learn correlated representations and to help filter out modality-specific noise. We also design a dynamic cross-modality interaction module (DCIM) to effectively integrate position and context information. The module employs PET images to learn regional position information and serves as a bridge to assist in modeling the relationships between local features of CT images. Extensive experiments on a comprehensive benchmark demonstrate the effectiveness of CIPA compared to current state-of-the-art segmentation methods. We hope our research can provide more exploration opportunities for medical image segmentation. The dataset and code are available at https://github.com/mj129/CIPA.
Cite
Text
Mei et al. "Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01459
Markdown
[Mei et al. "Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/mei2025cvpr-crossmodal/) doi:10.1109/CVPR52734.2025.01459
BibTeX
@inproceedings{mei2025cvpr-crossmodal,
title = {{Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images}},
author = {Mei, Jie and Lin, Chenyu and Qiu, Yu and Wang, Yaonan and Zhang, Hui and Wang, Ziyang and Dai, Dong},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2025},
pages = {15653-15662},
doi = {10.1109/CVPR52734.2025.01459},
url = {https://mlanthology.org/cvpr/2025/mei2025cvpr-crossmodal/}
}