Efficient Featurized Image Pyramid Network for Single Shot Detector

Abstract

Single-stage object detectors have recently gained popularity due to their combined advantage of high detection accuracy and real-time speed. However, while these detectors have achieved promising results on standard-sized objects, their performance on small objects is far from satisfactory. To detect very small or very large objects, the classical pyramid representation can be exploited, where an image pyramid is used to build a feature pyramid (a featurized image pyramid), enabling detection across a range of scales. Existing single-stage detectors avoid such a featurized image pyramid representation due to its memory and time complexity. In this paper, we introduce a light-weight architecture to efficiently produce a featurized image pyramid in a single-stage detection framework. The resulting multi-scale features are then injected into the prediction layers of the detector using an attention module. The performance of our detector is validated on two benchmarks: PASCAL VOC and MS COCO. For a 300×300 input, our detector operates at 111 frames per second (FPS) on a Titan X GPU, providing state-of-the-art detection accuracy on the PASCAL VOC 2007 test set. On the MS COCO test set, our detector achieves state-of-the-art results, surpassing all existing single-stage methods in the case of single-scale inference.
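The two ideas in the abstract, building an image pyramid and injecting its features into the detector through attention, can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the abstract does not specify the downsampling operator, the light-weight featurization layers, or the form of the attention module. The sketch uses 2×2 average pooling for the pyramid, skips the learned featurization entirely (the downsampled image stands in for a pyramid feature map), and uses a hypothetical sigmoid gate with a residual term as the "attention". The function names `downsample_2x`, `build_image_pyramid`, and `attention_inject` are invented for this example.

```python
import math


def downsample_2x(image):
    """Halve height and width by averaging each non-overlapping 2x2 block."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_image_pyramid(image, num_levels):
    """Return [image, image/2, image/4, ...] with num_levels entries."""
    pyramid = [image]
    for _ in range(num_levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


def attention_inject(detector_map, pyramid_map):
    """Modulate a detector feature map with a same-sized pyramid map.

    Hypothetical gating, not the paper's module: each detector value d
    becomes d * (1 + sigmoid(p)), so the pyramid value p can amplify the
    detector feature but never suppress it below its original value.
    """
    return [
        [d * (1.0 + 1.0 / (1.0 + math.exp(-p))) for d, p in zip(d_row, p_row)]
        for d_row, p_row in zip(detector_map, pyramid_map)
    ]


# Toy 8x8 single-channel "image" with pixel values 0..63.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_image_pyramid(image, num_levels=3)
sizes = [(len(level), len(level[0])) for level in pyramid]

# A 2x2 stand-in for a detector prediction-layer feature map; it is fused
# with the pyramid level of matching spatial size.
detector_map = [[1.0, 1.0], [1.0, 1.0]]
fused = attention_inject(detector_map, pyramid[2])
```

In a real detector each pyramid level would pass through a small convolutional featurization network before fusion, and the gate would be learned; the sketch only shows how each pyramid level is paired with the prediction layer of matching resolution.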

Cite

Text

Pang et al. "Efficient Featurized Image Pyramid Network for Single Shot Detector." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00751

Markdown

[Pang et al. "Efficient Featurized Image Pyramid Network for Single Shot Detector." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/pang2019cvpr-efficient/) doi:10.1109/CVPR.2019.00751

BibTeX

@inproceedings{pang2019cvpr-efficient,
  title     = {{Efficient Featurized Image Pyramid Network for Single Shot Detector}},
  author    = {Pang, Yanwei and Wang, Tiancai and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Shao, Ling},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00751},
  url       = {https://mlanthology.org/cvpr/2019/pang2019cvpr-efficient/}
}