3D Human Pose Estimation via Explicit Compositional Depth Maps

Abstract

In this work, we tackle the problem of estimating the 3D human pose in camera space from a monocular image. First, we propose densely generated limb depth maps, which are well aligned with image cues, to ease the learning of body-joint depths. Then, we design a module that lifts 2D pixel coordinates to 3D camera coordinates; it explicitly takes the estimated depth values as inputs and is consistent with the camera perspective projection model. We show that our method achieves superior performance on the large-scale 3D pose datasets Human3.6M and MPI-INF-3DHP, setting a new state of the art.
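
The lifting step described above follows the pinhole perspective projection model: a pixel (u, v) with depth Z maps to camera coordinates X = (u - cx) Z / fx and Y = (v - cy) Z / fy, where (fx, fy) are the focal lengths and (cx, cy) is the principal point. Below is a minimal NumPy sketch of this back-projection; the function name lift_to_camera, the intrinsics, and the array shapes are illustrative assumptions, not code from the paper.

import numpy as np

def lift_to_camera(uv, z, K):
    # Illustrative sketch (not the paper's code): inverts the pinhole
    # perspective projection to map pixel coordinates plus depth into
    # 3D camera-space coordinates.
    fx, fy = K[0, 0], K[1, 1]   # focal lengths in pixels
    cx, cy = K[0, 2], K[1, 2]   # principal point
    x = (uv[:, 0] - cx) * z / fx
    y = (uv[:, 1] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # (N, 3) camera-space points

# Example: one joint at pixel (620, 480) with 4.2 m depth,
# using made-up intrinsics for a 1000x1000 image.
K = np.array([[1150.0,    0.0, 500.0],
              [   0.0, 1150.0, 500.0],
              [   0.0,    0.0,   1.0]])
print(lift_to_camera(np.array([[620.0, 480.0]]), np.array([4.2]), K))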

Cite

Text

Wu and Xiao. "3D Human Pose Estimation via Explicit Compositional Depth Maps." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I07.6923

Markdown

[Wu and Xiao. "3D Human Pose Estimation via Explicit Compositional Depth Maps." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/wu2020aaai-d/) doi:10.1609/AAAI.V34I07.6923

BibTeX

@inproceedings{wu2020aaai-d,
  title     = {{3D Human Pose Estimation via Explicit Compositional Depth Maps}},
  author    = {Wu, Haiping and Xiao, Bin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {12378--12385},
  doi       = {10.1609/AAAI.V34I07.6923},
  url       = {https://mlanthology.org/aaai/2020/wu2020aaai-d/}
}