PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation

Benzine, Abdallah; Chabot, Florian; Luvison, Bertrand; Pham, Quoc Cuong; Achard, Catherine

doi:10.1109/CVPR42600.2020.00689

CVPR 2020

Abstract

Recently, several deep learning models have been proposed for 3D human pose estimation. Nevertheless, most of these approaches focus only on the single-person case or estimate the 3D pose of a few people at high resolution. Furthermore, many applications such as autonomous driving or crowd analysis require pose estimation of a large number of people, possibly at low resolution. In this work, we present PandaNet (Pose estimAtioN and Detection Anchor-based Network), a new single-shot, anchor-based, multi-person 3D pose estimation approach. The proposed model performs bounding box detection and, for each detected person, 2D and 3D pose regression in a single forward pass. It does not need any post-processing to regroup joints, since the network predicts a full 3D pose for each bounding box, and it allows pose estimation of a possibly large number of people at low resolution. To manage overlapping people, we introduce a Pose-Aware Anchor Selection strategy. Moreover, as an imbalance exists between the different sizes of people in the image, and joint coordinates have different uncertainties depending on these sizes, we propose a method to automatically optimize the weights associated with different people scales and joints for efficient training. PandaNet surpasses previous single-shot methods on several challenging datasets: a virtual but very realistic multi-person urban dataset (JTA Dataset), and two real-world 3D multi-person datasets (CMU Panoptic and MuPoTS-3D).
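
The abstract mentions automatically optimizing the weights attached to different people scales and joints, but it does not spell out the scheme. As a rough illustration only, the sketch below assumes a common uncertainty-style weighting in which each loss term L_i gets a learnable log-variance s_i and contributes exp(-s_i) * L_i + s_i to the total. The function names and the exact formula are assumptions for illustration, not taken from this page.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine loss terms with learnable log-variances (assumed scheme).

    Each term contributes exp(-s_i) * L_i + s_i, so a term with a large
    s_i (high uncertainty) is down-weighted, while the + s_i penalty
    keeps the optimizer from inflating every s_i without bound.
    """
    if len(losses) != len(log_vars):
        raise ValueError("losses and log_vars must have the same length")
    return sum(math.exp(-s) * l + s for l, s in zip(losses, log_vars))


def optimal_log_var(loss):
    """Log-variance minimizing exp(-s) * loss + s for a fixed positive loss.

    Setting the derivative -exp(-s) * loss + 1 to zero gives s = log(loss).
    """
    return math.log(loss)
```

With all log-variances at zero the terms are simply summed. In training, the s_i would be optimized jointly with the network, so terms with noisier targets (for example, joints of small, distant people) would end up with smaller effective weights.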

Cite

Text

Benzine et al. "PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.00689

Markdown

[Benzine et al. "PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/benzine2020cvpr-pandanet/) doi:10.1109/CVPR42600.2020.00689

BibTeX

@inproceedings{benzine2020cvpr-pandanet,
  title     = {{PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation}},
  author    = {Benzine, Abdallah and Chabot, Florian and Luvison, Bertrand and Pham, Quoc Cuong and Achard, Catherine},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.00689},
  url       = {https://mlanthology.org/cvpr/2020/benzine2020cvpr-pandanet/}
}