IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection
Abstract
The recent Segment Anything Model (SAM) is a significant advancement in natural image segmentation, exhibiting potent zero-shot performance suitable for various downstream image segmentation tasks. However, directly applying the pretrained SAM to the Infrared Small Target Detection (IRSTD) task fails to achieve satisfactory performance because of a notable domain gap between natural and infrared images. Unlike a visible light camera, a thermal imager reveals an object’s temperature distribution by capturing infrared radiation. Small targets often show only a subtle temperature transition at the object’s boundaries. To address this issue, we propose the IRSAM model for IRSTD, which improves SAM’s encoder-decoder architecture to learn better feature representations of infrared small objects. Specifically, we design a Perona-Malik diffusion (PMD)-based block and incorporate it into multiple levels of SAM’s encoder to help it capture essential structural features while suppressing noise. Additionally, we devise a Granularity-Aware Decoder (GAD) that fuses the multi-granularity features from the encoder to capture structural information that may be lost in long-distance modeling. Extensive experiments on public datasets, including NUAA-SIRST, NUDT-SIRST, and IRSTD-1K, validate the design choices of IRSAM and its significant superiority over representative state-of-the-art methods. The source code is available at: github.com/IPIC-Lab/IRSAM.
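For readers unfamiliar with the Perona-Malik diffusion mentioned in the abstract, the following is a minimal sketch of the classical image-space scheme, which smooths flat regions while leaving strong edges largely intact. It is not the PMD block proposed in the paper, whose design inside SAM's encoder is not described here. The function name `perona_malik`, the exponential conduction function, and the parameters `n_iter`, `kappa`, and `step` are assumptions chosen for illustration.

```python
import numpy as np


def perona_malik(img, n_iter=10, kappa=0.1, step=0.2):
    """Classical Perona-Malik diffusion on a 2-D array (illustrative only).

    kappa sets the edge-sensitivity threshold; step should stay <= 0.25
    for this explicit 4-neighbour scheme to remain stable.
    """
    u = np.asarray(img, dtype=np.float64).copy()
    for _ in range(n_iter):
        # Edge-replicated padding gives zero-flux borders.
        p = np.pad(u, 1, mode="edge")
        # Differences to the four nearest neighbours.
        d_n = p[:-2, 1:-1] - u
        d_s = p[2:, 1:-1] - u
        d_w = p[1:-1, :-2] - u
        d_e = p[1:-1, 2:] - u
        # Conduction g(d) = exp(-(d / kappa)^2): close to 1 for small
        # differences (noise is smoothed), close to 0 across strong edges
        # (structure is kept).
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in (d_n, d_s, d_w, d_e))
        u += step * flux
    return u
```

Applied to a noisy step image, this sketch should reduce variation inside each flat region while keeping the intensity jump between regions, which is the behaviour the abstract refers to as capturing structure while suppressing noise.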
Cite
Text
Zhang et al. "IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72855-6_14
Markdown
[Zhang et al. "IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/zhang2024eccv-irsam/) doi:10.1007/978-3-031-72855-6_14
BibTeX
@inproceedings{zhang2024eccv-irsam,
title = {{IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection}},
author = {Zhang, Mingjin and Wang, Yuchun and Guo, Jie and Li, Yunsong and Gao, Xinbo and Zhang, Jing},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72855-6_14},
url = {https://mlanthology.org/eccv/2024/zhang2024eccv-irsam/}
}