Towards Viewpoint Invariant 3D Human Pose Estimation

Abstract

We propose a viewpoint invariant model for 3D human pose estimation from a single depth image. To achieve this, our discriminative model embeds local regions into a learned viewpoint invariant feature space. Formulated as a multi-task learning problem, our model is able to selectively predict partial poses in the presence of noise and occlusion. Our approach leverages a convolutional and recurrent network architecture with a top-down error feedback mechanism to self-correct previous pose estimates in an end-to-end manner. We evaluate our model on a previously published depth dataset and a newly collected human pose dataset containing 100K annotated depth images from extreme viewpoints. Experiments show that our model achieves competitive performance on frontal views while achieving state-of-the-art performance on alternate viewpoints.

Cite

Text

Haque et al. "Towards Viewpoint Invariant 3D Human Pose Estimation." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46448-0_10

Markdown

[Haque et al. "Towards Viewpoint Invariant 3D Human Pose Estimation." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/haque2016eccv-viewpoint/) doi:10.1007/978-3-319-46448-0_10

BibTeX

@inproceedings{haque2016eccv-viewpoint,
  title     = {{Towards Viewpoint Invariant 3D Human Pose Estimation}},
  author    = {Haque, Albert and Peng, Boya and Luo, Zelun and Alahi, Alexandre and Yeung, Serena and Fei-Fei, Li},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {160--177},
  doi       = {10.1007/978-3-319-46448-0_10},
  url       = {https://mlanthology.org/eccv/2016/haque2016eccv-viewpoint/}
}