How to Fully Exploit the Abilities of Aerial Image Detectors

Abstract

Object detection in aerial images usually faces two major challenges: (1) difficult targets (e.g., small objects, objects subject to background interference, and objects in varied orientations); and (2) the imbalance problems inherent in object detection (e.g., imbalanced numbers of instances across categories, imbalanced sampling, and imbalanced losses between classification and localization). Because of these challenges, detectors often cannot be trained and tested as effectively as they could be. In this paper, we propose a simple but effective framework to address these concerns. First, we propose an adaptive cropping method based on a Difficult Region Estimation Network (DREN) to improve the detection of difficult targets, which allows the detector to realize its full performance during the testing phase. Second, we use the well-trained DREN to generate more diverse and representative training images, which effectively augments the training set. In addition, to alleviate the impact of imbalance during training, we add a balance module that adopts the IoU-balanced sampling method and the balanced L1 loss. Finally, we evaluate our method on two aerial image datasets. Without bells and whistles, our framework achieves 8.0 points and 3.3 points higher Average Precision (AP) than the corresponding baselines on VisDrone and UAVDT, respectively.
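
The abstract names the balanced L1 loss but does not define it. As a rough illustration only, the sketch below follows the formulation commonly attributed to Libra R-CNN (Pang et al., 2019), applied to a single regression residual. The function name `balanced_l1`, the fixed threshold of 1, and the default values `alpha=0.5` and `gamma=1.5` are assumptions made for this example and are not confirmed as the settings used in this paper.

```python
import math


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for one regression residual x (hypothetical helper).

    Assumed form (threshold fixed at |x| = 1):
      |x| <  1: alpha / b * (b|x| + 1) * ln(b|x| + 1) - alpha * |x|
      |x| >= 1: gamma * |x| + C
    """
    # b is chosen so that alpha * ln(b + 1) == gamma, which keeps the
    # gradient continuous where the two branches meet.
    b = math.exp(gamma / alpha) - 1.0
    ax = abs(x)
    if ax < 1.0:
        # Inlier branch: gradient alpha * ln(b|x| + 1) grows faster than
        # the linear gradient of smooth L1 for small residuals.
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    # Outlier branch: gradient is capped at gamma; the constant
    # C = gamma / b - alpha makes the loss itself continuous at |x| = 1.
    return gamma * ax + gamma / b - alpha


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 1.0, 2.0):
        print(r, balanced_l1(r))
```

Under these assumptions, small residuals (inliers) contribute a larger gradient than they would under smooth L1, while the gradient of large residuals (outliers) stays bounded by `gamma`, which is the sense in which the loss is "balanced".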

Cite

Text

Zhang et al. "How to Fully Exploit the Abilities of Aerial Image Detectors." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00007

Markdown

[Zhang et al. "How to Fully Exploit the Abilities of Aerial Image Detectors." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/zhang2019iccvw-fully/) doi:10.1109/ICCVW.2019.00007

BibTeX

@inproceedings{zhang2019iccvw-fully,
  title     = {{How to Fully Exploit the Abilities of Aerial Image Detectors}},
  author    = {Zhang, Junyi and Huang, Junying and Chen, Xuankun and Zhang, Dongyu},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {1-8},
  doi       = {10.1109/ICCVW.2019.00007},
  url       = {https://mlanthology.org/iccvw/2019/zhang2019iccvw-fully/}
}