Generalizable Multi-Camera 3D Pedestrian Detection

Abstract

We present a multi-camera 3D pedestrian detection method that does not require training with data from the target scene. We estimate pedestrian locations on the ground plane using a novel heuristic based on human body poses and person bounding boxes obtained from an off-the-shelf monocular detector. We then project these locations onto the world ground plane and fuse them with a new formulation of a clique cover problem. We also propose an optional step that exploits pedestrian appearance during fusion by using a domain-generalizable person re-identification model. We evaluated the proposed approach on the challenging WILDTRACK dataset. It obtained a MODA of 0.569 and an F-score of 0.78, superior to state-of-the-art generalizable detection techniques.
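The sketch below illustrates the per-camera steps described in the abstract: estimating a ground-contact point from a detected pose and bounding box, projecting it onto the world ground plane with a homography, and fusing points from multiple cameras. It is a minimal illustration, not the authors' implementation: the function names, the COCO ankle indices, the keypoint-confidence threshold, and the fusion radius are assumptions, the greedy grouping stands in for the paper's clique cover formulation, and the optional re-identification step is omitted.

import numpy as np

def ground_point(pose_xy, pose_conf, bbox, conf_thresh=0.3):
    # Hypothetical heuristic: average the ankle keypoints (COCO indices 15, 16)
    # when they are confidently detected, otherwise fall back to the
    # bottom-center of the person's bounding box (x1, y1, x2, y2).
    ankles = [pose_xy[i] for i in (15, 16) if pose_conf[i] > conf_thresh]
    if ankles:
        return np.mean(ankles, axis=0)
    x1, y1, x2, y2 = bbox
    return np.array([(x1 + x2) / 2.0, y2])

def to_world(point_img, H):
    # Project an image-plane point onto the world ground plane using the
    # camera's image-to-ground homography H (3x3), then dehomogenize.
    p = H @ np.array([point_img[0], point_img[1], 1.0])
    return p[:2] / p[2]

def fuse_detections(world_points, radius=0.5):
    # Simplified fusion: greedily group ground-plane points (pooled from all
    # cameras) that lie within `radius` meters of each other and average each
    # group into a single 3D pedestrian detection. The paper instead poses
    # this association as a clique cover problem.
    points = [np.asarray(p, dtype=float) for p in world_points]
    fused, used = [], [False] * len(points)
    for i, p in enumerate(points):
        if used[i]:
            continue
        group, used[i] = [p], True
        for j in range(i + 1, len(points)):
            if not used[j] and np.linalg.norm(points[j] - p) < radius:
                group.append(points[j])
                used[j] = True
        fused.append(np.mean(group, axis=0))
    return fused

In practice, the homography H would come from each camera's calibration, and the ground-plane points from all views are pooled before fusion.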

Cite

Text

Lima et al. "Generalizable Multi-Camera 3D Pedestrian Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021. doi:10.1109/CVPRW53098.2021.00135

Markdown

[Lima et al. "Generalizable Multi-Camera 3D Pedestrian Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021.](https://mlanthology.org/cvprw/2021/lima2021cvprw-generalizable/) doi:10.1109/CVPRW53098.2021.00135

BibTeX

@inproceedings{lima2021cvprw-generalizable,
  title     = {{Generalizable Multi-Camera 3D Pedestrian Detection}},
  author    = {Lima, João Paulo and Roberto, Rafael and Figueiredo, Lucas Silva and Simões, Francisco and Teichrieb, Veronica},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2021},
  pages     = {1232--1240},
  doi       = {10.1109/CVPRW53098.2021.00135},
  url       = {https://mlanthology.org/cvprw/2021/lima2021cvprw-generalizable/}
}