Box Aggregation for Proposal Decimation: Last Mile of Object Detection

Abstract

Regions-with-convolutional-neural-network (RCNN) is now a commonly employed object detection pipeline. Its main steps, i.e., proposal generation and convolutional neural network (CNN) feature extraction, have been intensively investigated. We focus on the last step of the system, which aggregates thousands of scored box proposals into the final object predictions; we call this step proposal decimation. We show that this step can be enhanced with a very simple box aggregation function that takes into account the statistical properties of proposals with respect to ground truth objects. Our method requires extremely lightweight computation, yet it yields an improvement of 3.7% in mAP on the PASCAL VOC 2007 test set. We use statistics to explain why it works.

Cite

Text

Liu et al. "Box Aggregation for Proposal Decimation: Last Mile of Object Detection." International Conference on Computer Vision, 2015. doi:10.1109/ICCV.2015.295

Markdown

[Liu et al. "Box Aggregation for Proposal Decimation: Last Mile of Object Detection." International Conference on Computer Vision, 2015.](https://mlanthology.org/iccv/2015/liu2015iccv-box/) doi:10.1109/ICCV.2015.295

BibTeX

@inproceedings{liu2015iccv-box,
  title     = {{Box Aggregation for Proposal Decimation: Last Mile of Object Detection}},
  author    = {Liu, Shu and Lu, Cewu and Jia, Jiaya},
  booktitle = {International Conference on Computer Vision},
  year      = {2015},
  doi       = {10.1109/ICCV.2015.295},
  url       = {https://mlanthology.org/iccv/2015/liu2015iccv-box/}
}