NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

Abstract

Current state-of-the-art convolutional architectures for object detection are manually designed. Here we aim to learn a better feature pyramid network architecture for object detection. We adopt Neural Architecture Search and discover a new feature pyramid architecture in a novel scalable search space covering all cross-scale connections. The discovered architecture, named NAS-FPN, consists of a combination of top-down and bottom-up connections to fuse features across scales. NAS-FPN, combined with various backbone models in the RetinaNet framework, achieves a better accuracy and latency tradeoff compared to state-of-the-art object detection models. NAS-FPN improves mobile detection accuracy by 2 AP compared to the state-of-the-art SSDLite with MobileNetV2 model in [32] and achieves 48.3 AP, which surpasses Mask R-CNN [10] detection accuracy with less computation time.
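To make the idea of top-down and bottom-up cross-scale connections concrete, the following is a minimal sketch, not the NAS-FPN implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbor resizing (which acts as strided subsampling when shrinking), and element-wise summation as the merge. The names `resize_nearest` and `fuse` are hypothetical, and none of these choices are stated in the abstract.

```python
# Illustrative sketch only: the helper names and the sum-based merge are
# assumptions for this example, not taken from the abstract.

def resize_nearest(fmap, out_h, out_w):
    """Resize a 2D feature map (list of rows) by nearest-neighbor sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def fuse(fmap_a, fmap_b, out_h, out_w):
    """Bring both maps to the target resolution, then add them element-wise."""
    a = resize_nearest(fmap_a, out_h, out_w)
    b = resize_nearest(fmap_b, out_h, out_w)
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# A fine (4x4) and a coarse (2x2) pyramid level.
fine = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
coarse = [[10, 20], [30, 40]]

# Top-down connection: the coarse level is upsampled to the fine resolution.
top_down = fuse(fine, coarse, 4, 4)

# Bottom-up connection: the fine level is downsampled to the coarse resolution.
bottom_up = fuse(fine, coarse, 2, 2)
```

Here `top_down` keeps the 4x4 resolution and `bottom_up` keeps the 2x2 resolution. A learned pyramid in the spirit of the abstract would combine many such connections across all scales rather than a single pair.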

Cite

Text

Ghiasi et al. "NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00720

Markdown

[Ghiasi et al. "NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/ghiasi2019cvpr-nasfpn/) doi:10.1109/CVPR.2019.00720

BibTeX

@inproceedings{ghiasi2019cvpr-nasfpn,
  title     = {{NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection}},
  author    = {Ghiasi, Golnaz and Lin, Tsung-Yi and Le, Quoc V.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00720},
  url       = {https://mlanthology.org/cvpr/2019/ghiasi2019cvpr-nasfpn/}
}