Precise Detection in Densely Packed Scenes

Abstract

Man-made scenes are often densely packed, containing numerous objects, often identical, positioned in close proximity. We show that precise object detection in such scenes remains a challenging frontier even for state-of-the-art object detectors. We propose a novel, deep-learning-based method for precise object detection, designed for such challenging settings. Our contributions include: (1) a layer for estimating the Jaccard index as a detection quality score; (2) a novel EM merging unit, which uses our quality scores to resolve detection overlap ambiguities; and (3) an extensive, annotated data set, SKU-110K, representing packed retail environments, released for training and testing under such extreme settings. Detection tests on SKU-110K, and counting tests on the CARPK and PUCPR+ data sets, show that our method outperforms the existing state of the art by substantial margins.
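
The Jaccard index named in contribution (1) is the intersection-over-union of two regions. The following is a minimal sketch of that quantity for two axis-aligned bounding boxes. The function name `jaccard_index` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration and do not come from the paper. The sketch computes the index directly from two known boxes, whereas the paper proposes a network layer that estimates it.

```python
from typing import Tuple

# Assumed box layout for this illustration: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def jaccard_index(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp at zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area inputs.
    return inter / union if union > 0 else 0.0
```

In densely packed scenes, neighboring detections often overlap heavily, which is why a per-detection estimate of this score can help decide which overlapping boxes to keep.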

Cite

Text

Goldman et al. "Precise Detection in Densely Packed Scenes." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00537

Markdown

[Goldman et al. "Precise Detection in Densely Packed Scenes." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/goldman2019cvpr-precise/) doi:10.1109/CVPR.2019.00537

BibTeX

@inproceedings{goldman2019cvpr-precise,
  title     = {{Precise Detection in Densely Packed Scenes}},
  author    = {Goldman, Eran and Herzig, Roei and Eisenschtat, Aviv and Goldberger, Jacob and Hassner, Tal},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00537},
  url       = {https://mlanthology.org/cvpr/2019/goldman2019cvpr-precise/}
}