Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection
Abstract
Unlike traditional object detection, infrared small target detection cannot rely on visual features alone, because targets are small and their contrast with the background is weak. Improving detection performance requires additional target representations. Motion representations have proven to be among the most promising feature types for infrared small target detection. However, existing methods share an obvious weakness: beyond vision features, they capture only coarse motion representations from the temporal domain. Combined with vision features, fine motion representations could enhance detection performance more effectively. To overcome this weakness, and inspired by prevalent vision-language models, we propose the first vision-language framework with motion prior knowledge learning (MoPKL). Moving beyond the traditional pure-vision modality, it uses homogeneous language descriptions, formatted for moving targets, to directionally guide the vision channel in learning motion prior knowledge. Aided by motion-vision alignment and motion-relation mining, the motion of infrared small targets is further refined by graph attention to generate finer motion representations. Extensive experiments on the ITSDT-15K and IRDST datasets show that our framework is effective and often clearly outperforms other methods.
Cite
Text
Chen et al. "Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I2.32217
Markdown
[Chen et al. "Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/chen2025aaai-motion/) doi:10.1609/AAAI.V39I2.32217
BibTeX
@inproceedings{chen2025aaai-motion,
title = {{Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection}},
author = {Chen, Shengjia and Ji, Luping and Duan, Weiwei and Peng, Shuang and Ye, Mao},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {2186-2194},
doi = {10.1609/AAAI.V39I2.32217},
url = {https://mlanthology.org/aaai/2025/chen2025aaai-motion/}
}