Spatial Attention for Multi-Scale Feature Refinement for Object Detection

Abstract

Scale variation is one of the primary challenges in object detection. It arises among both inter-class and intra-class instances, and is especially pronounced on drone platforms. Recent methods rely on feature pyramids to detect objects at different scales. In this work, we propose two techniques that refine multi-scale features for detecting instances of various scales in FPN-based networks. A Receptive Field Expansion Block (RFEB) is designed to increase the receptive field size of high-level semantic features. The generated features are then passed through a Spatial-Refinement Module (SRM), which repairs the spatial details of multi-scale objects in images before summation through the lateral connection. To evaluate the effectiveness of this approach, we conduct experiments on the VisDrone2019 benchmark dataset and achieve an impressive improvement. Results on the PASCAL VOC and MS COCO datasets further show that our model reaches competitive performance.
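To make the notion of receptive field expansion concrete, the following minimal sketch computes the receptive field of a stack of convolution layers using the standard recurrence. It is an illustration only: the abstract does not state how the RFEB is built, so the use of dilated convolutions here is an assumption, and the function name `receptive_field` and the two example layer stacks are hypothetical.

```python
from typing import Iterable, Tuple

# Each layer is described as (kernel_size, stride, dilation).
Layer = Tuple[int, int, int]


def receptive_field(layers: Iterable[Layer]) -> int:
    """Return the receptive field (in input pixels, along one axis) of a
    stack of convolution layers listed from input to output.

    Standard recurrence: each layer widens the field by
    (kernel_size - 1) * dilation * jump, where jump is the product of the
    strides of all preceding layers.
    """
    rf = 1    # a single output unit initially sees one input pixel
    jump = 1  # spacing, in input pixels, between adjacent units
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical stacks, not taken from the paper:
# three plain 3x3 convolutions versus three 3x3 convolutions
# with dilation rates 1, 2, and 4 (an assumed design).
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

if __name__ == "__main__":
    print("plain stack:  ", receptive_field(plain))
    print("dilated stack:", receptive_field(dilated))
```

Under these assumptions, the dilated stack covers a wider input region than the plain stack with the same number of 3x3 kernels, which is one common way to enlarge the receptive field of high-level features without adding parameters.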

Cite

Text

Wang et al. "Spatial Attention for Multi-Scale Feature Refinement for Object Detection." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00014

Markdown

[Wang et al. "Spatial Attention for Multi-Scale Feature Refinement for Object Detection." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/wang2019iccvw-spatial/) doi:10.1109/ICCVW.2019.00014

BibTeX

@inproceedings{wang2019iccvw-spatial,
  title     = {{Spatial Attention for Multi-Scale Feature Refinement for Object Detection}},
  author    = {Wang, Haoran and Wang, Zexin and Jia, Meixia and Li, Aijin and Feng, Tuo and Zhang, Wenhua and Jiao, Licheng},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {64-72},
  doi       = {10.1109/ICCVW.2019.00014},
  url       = {https://mlanthology.org/iccvw/2019/wang2019iccvw-spatial/}
}