OpenRSD: Towards Open-Prompts for Object Detection in Remote Sensing Images

Abstract

Remote sensing object detection has made significant progress, but most studies still focus on closed-set detection, which limits generalization across diverse datasets. Open-vocabulary object detection (OVD) provides a solution by leveraging multimodal associations between text prompts and visual features. However, existing OVD methods for remote sensing (RS) images are constrained by small-scale datasets and fail to address the unique challenges of remote sensing interpretation, including oriented object detection and the need for both high precision and real-time performance in diverse scenarios. To tackle these challenges, we propose OpenRSD, a universal open-prompt RS object detection framework. OpenRSD supports multimodal prompts and integrates multi-task detection heads to balance accuracy and real-time requirements. Additionally, we design a multi-stage training pipeline to enhance the generalization of the model. Evaluated on seven public datasets, OpenRSD demonstrates superior performance in both oriented and horizontal bounding box detection, with real-time inference capabilities suitable for large-scale RS image analysis. Compared to YOLO-World, OpenRSD achieves an 8.7% higher average precision at an inference speed of 20.8 FPS. Code and models will be released.
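
The abstract describes OVD as associating text prompts with visual features. As a rough illustration of that general idea only (not OpenRSD's actual architecture, which the abstract does not specify), the sketch below labels each candidate region with the prompt whose embedding is most similar to it. The toy embeddings, the function names `cosine` and `classify_regions`, and the 0.5 threshold are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify_regions(region_feats, prompt_embeds, threshold=0.5):
    """Give each region its best-matching prompt, or None if below threshold.

    region_feats: list of region feature vectors (hypothetical visual features).
    prompt_embeds: dict mapping a prompt name to its embedding vector.
    """
    results = []
    for feat in region_feats:
        scores = {name: cosine(feat, emb) for name, emb in prompt_embeds.items()}
        best = max(scores, key=scores.get)
        label = best if scores[best] >= threshold else None
        results.append((label, scores[best]))
    return results

# Toy 3-D embeddings standing in for real text and visual encoders (assumed).
prompts = {"airplane": [1.0, 0.0, 0.0], "ship": [0.0, 1.0, 0.0]}
regions = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.0, 1.0]]

labels = [label for label, _ in classify_regions(regions, prompts)]
print(labels)
```

Because the category set is just the keys of `prompts`, new classes can be added at inference time by supplying another embedding, which is the property that distinguishes open-vocabulary from closed-set detection.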

Cite

Text

Huang et al. "OpenRSD: Towards Open-Prompts for Object Detection in Remote Sensing Images." International Conference on Computer Vision, 2025.

Markdown

[Huang et al. "OpenRSD: Towards Open-Prompts for Object Detection in Remote Sensing Images." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/huang2025iccv-openrsd/)

BibTeX

@inproceedings{huang2025iccv-openrsd,
  title     = {{OpenRSD: Towards Open-Prompts for Object Detection in Remote Sensing Images}},
  author    = {Huang, Ziyue and Feng, Yongchao and Liu, Ziqi and Yang, Shuai and Liu, Qingjie and Wang, Yunhong},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {8384-8394},
  url       = {https://mlanthology.org/iccv/2025/huang2025iccv-openrsd/}
}