Learning Globally Optimized Object Detector via Policy Gradient

Abstract

In this paper, we propose a simple yet effective method to learn a globally optimized detector for object detection. The method is a small modification to the standard cross-entropy gradient, inspired by the REINFORCE algorithm. In our approach, the cross-entropy gradient for each detection candidate is adaptively adjusted according to the overall mean Average Precision (mAP) of the current state, which leads to a more effective gradient and global optimization of detection results, and brings no computational overhead. Benefiting from the more precise gradients produced by this global optimization method, our framework significantly improves state-of-the-art object detectors. Furthermore, since our method operates on scores and bounding boxes without modifying the architecture of the object detector, it can be easily applied to off-the-shelf modern object detection frameworks.
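
As a rough illustration of the idea in the abstract, the sketch below scales each detection candidate's softmax cross-entropy gradient by a scalar reward, in the style of REINFORCE. It is not the paper's implementation. The function names, the `rewards` array (a stand-in for whatever mAP-derived signal is used), and the `baseline` parameter are all assumptions made for this example, and how the reward is actually computed from mAP is not specified here.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def reward_weighted_ce_grad(logits, actions, rewards, baseline=0.0):
    """Gradient of -(r - b) * log p[a] with respect to the logits.

    logits:   (N, C) class scores for N detection candidates.
    actions:  (N,) integer class assigned to each candidate.
    rewards:  (N,) hypothetical per-candidate reward, e.g. an
              mAP-derived signal (assumption, not from the paper).
    baseline: scalar subtracted from the rewards (assumption).
    """
    probs = softmax(logits)
    onehot = np.eye(logits.shape[1])[actions]
    # Standard cross-entropy gradient w.r.t. the logits.
    ce_grad = probs - onehot
    # REINFORCE-style scaling: one scalar weight per candidate.
    advantage = (np.asarray(rewards, dtype=float) - baseline)[:, None]
    return advantage * ce_grad


# Two candidates, three classes, uniform scores.
example_grad = reward_weighted_ce_grad(
    logits=np.zeros((2, 3)),
    actions=np.array([0, 2]),
    rewards=np.array([1.0, 0.0]),
)
```

With a reward of 1 the result equals the plain cross-entropy gradient, and with a reward of 0 the candidate contributes nothing, so the extra cost over ordinary training in this sketch is one multiplication per candidate.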

Cite

Text

Rao et al. "Learning Globally Optimized Object Detector via Policy Gradient." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00648

Markdown

[Rao et al. "Learning Globally Optimized Object Detector via Policy Gradient." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/rao2018cvpr-learning/) doi:10.1109/CVPR.2018.00648

BibTeX

@inproceedings{rao2018cvpr-learning,
  title     = {{Learning Globally Optimized Object Detector via Policy Gradient}},
  author    = {Rao, Yongming and Lin, Dahua and Lu, Jiwen and Zhou, Jie},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00648},
  url       = {https://mlanthology.org/cvpr/2018/rao2018cvpr-learning/}
}