Count- and Similarity-Aware R-CNN for Pedestrian Detection

Abstract

Recent pedestrian detection methods generally rely on additional supervision, such as visible bounding-box annotations, to handle heavy occlusions. We propose an approach that leverages pedestrian count and proposal similarity information within a two-stage pedestrian detection framework. Both pedestrian count and proposal similarity are derived from the standard full-body annotations commonly used to train pedestrian detectors. We introduce a count-weighted detection loss function that assigns higher weights to detection errors occurring at highly overlapping pedestrians. The proposed loss function is used at both stages of the two-stage detector. We further introduce a count-and-similarity branch within the two-stage detection framework, which predicts pedestrian count as well as proposal similarity. Lastly, we introduce a count- and similarity-aware NMS strategy to identify distinct proposals. Our approach requires neither part information nor visible bounding-box annotations. Experiments are performed on the CityPersons and CrowdHuman datasets. Our method sets a new state of the art on both datasets. Further, it achieves an absolute gain of 2.4% over the current state of the art, in terms of log-average miss rate, on the heavily occluded (HO) set of the CityPersons test set. Finally, we demonstrate the applicability of our approach to the problem of human instance segmentation. Code and models are available at: https://github.com/Leotju/CaSe.
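The abstract does not give the form of the count-weighted loss, so the following is only a minimal sketch of the general idea: per-sample detection losses are scaled up where several full-body ground-truth boxes overlap the same proposal. The function names, the IoU threshold `iou_thr`, the weight rule `1 + alpha * (count - 1)`, and the normalization are all assumptions made for illustration and are not taken from the paper or its released code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlap_count(proposal, gt_boxes, iou_thr=0.5):
    """Hypothetical count: number of full-body ground-truth boxes whose
    IoU with the proposal is at least iou_thr (assumed definition)."""
    return sum(1 for g in gt_boxes if iou(proposal, g) >= iou_thr)


def count_weighted_loss(per_sample_losses, counts, alpha=0.5):
    """Assumed weighting: weight = 1 + alpha * max(count - 1, 0), so a
    proposal covering a single pedestrian keeps weight 1 and crowded
    proposals are penalized more. Returns the weighted sum divided by
    the number of samples."""
    if not per_sample_losses:
        return 0.0
    weights = [1.0 + alpha * max(c - 1, 0) for c in counts]
    total = sum(w * l for w, l in zip(weights, per_sample_losses))
    return total / len(per_sample_losses)
```

In this sketch the counts come only from full-body boxes, mirroring the abstract's point that no visible-box or part annotations are needed; any real implementation would apply the weights to the classification and regression losses at both detector stages.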

Cite

Text

Xie et al. "Count- and Similarity-Aware R-CNN for Pedestrian Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58520-4_6

Markdown

[Xie et al. "Count- and Similarity-Aware R-CNN for Pedestrian Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/xie2020eccv-count/) doi:10.1007/978-3-030-58520-4_6

BibTeX

@inproceedings{xie2020eccv-count,
  title     = {{Count- and Similarity-Aware R-CNN for Pedestrian Detection}},
  author    = {Xie, Jin and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Pang, Yanwei and Shao, Ling and Shah, Mubarak},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58520-4_6},
  url       = {https://mlanthology.org/eccv/2020/xie2020eccv-count/}
}