End-to-End Integration of a Convolution Network, Deformable Parts Model and Non-Maximum Suppression
Abstract
Deformable Parts Models and Convolutional Networks each have achieved notable performance in object detection. Yet these two approaches find their strengths in complementary areas: DPMs are well-versed in object composition, modeling fine-grained spatial relationships between parts; likewise, ConvNets are adept at producing powerful image features, having been discriminatively trained directly on the pixels. In this paper, we propose a new model that combines these two approaches, obtaining the advantages of each. We train this model using a new structured loss function that considers all bounding boxes within an image, rather than isolated object instances. This enables the non-maximal suppression (NMS) operation, previously treated as a separate post-processing stage, to be integrated into the model. This allows for discriminative training of our combined ConvNet + DPM + NMS model in end-to-end fashion. We evaluate our system on PASCAL VOC 2007 and 2011 datasets, achieving competitive results on both benchmarks.
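For readers unfamiliar with the NMS step the abstract mentions, the following is a minimal sketch of conventional greedy non-maximum suppression, i.e. the separate post-processing stage that the paper says it folds into the model. It is not the paper's integrated formulation. The function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 >= x1, y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # illustrative value, not taken from the paper
) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box
    is at most iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

In this sketch the second box overlaps the first heavily and is suppressed, while the distant third box survives. Because the greedy selection is applied after scoring, a detector trained on isolated instances never sees its effect, which is the gap the abstract's structured loss over all bounding boxes is meant to address.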
Cite
Text
Wan et al. "End-to-End Integration of a Convolution Network, Deformable Parts Model and Non-Maximum Suppression." Conference on Computer Vision and Pattern Recognition, 2015.
Markdown
[Wan et al. "End-to-End Integration of a Convolution Network, Deformable Parts Model and Non-Maximum Suppression." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/wan2015cvpr-endtoend/)
BibTeX
@inproceedings{wan2015cvpr-endtoend,
title = {{End-to-End Integration of a Convolution Network, Deformable Parts Model and Non-Maximum Suppression}},
author = {Wan, Li and Eigen, David and Fergus, Rob},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2015},
url = {https://mlanthology.org/cvpr/2015/wan2015cvpr-endtoend/}
}