Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation

Abstract

Modern 3D human pose estimation techniques rely on deep networks, which require large amounts of training data. While weakly-supervised methods require less supervision, by utilizing 2D poses or multi-view imagery without annotations, they still need a sufficiently large set of samples with 3D annotations for learning to succeed. In this paper, we propose to overcome this problem by learning a geometry-aware body representation from multi-view images without annotations. To this end, we use an encoder-decoder that predicts an image from one viewpoint given an image from another viewpoint. Because this representation encodes 3D geometry, using it in a semi-supervised setting makes it easier to learn a mapping from it to 3D human pose. As evidenced by our experiments, our approach significantly outperforms fully-supervised methods given the same amount of labeled data, and improves over other semi-supervised methods while using as little as 1% of the labeled data.
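The view-to-view prediction described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the latent code is treated as a small set of 3D points, the change of viewpoint is applied by rotating those points with the relative camera rotation, and linear maps stand in for the deep encoder and decoder. All function and parameter names below (`encode`, `decode`, `predict_other_view`, `w_enc`, `w_dec`, `n_points`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def rotation_about_y(angle_rad):
    """Rotation matrix about the vertical axis, a stand-in for the relative camera rotation."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def encode(image, w_enc, n_points):
    """Hypothetical linear encoder: image -> latent of shape (3, n_points)."""
    return (w_enc @ image.ravel()).reshape(3, n_points)


def decode(latent, w_dec, image_shape):
    """Hypothetical linear decoder: latent of shape (3, n_points) -> image."""
    return (w_dec @ latent.ravel()).reshape(image_shape)


def predict_other_view(image, rel_rotation, w_enc, w_dec, n_points):
    """Encode one view, rotate the latent points, decode the other view."""
    latent = encode(image, w_enc, n_points)
    rotated = rel_rotation @ latent  # assumption: viewpoint change acts as a 3D rotation
    return decode(rotated, w_dec, image.shape)


def view_synthesis_loss(predicted, target):
    """Mean squared pixel error between the predicted and the real second view."""
    return float(np.mean((predicted - target) ** 2))


# Toy data: random weights and images, only to show the shapes involved.
rng = np.random.default_rng(0)
image_shape, n_points = (8, 8), 4
w_enc = rng.normal(size=(3 * n_points, image_shape[0] * image_shape[1]))
w_dec = rng.normal(size=(image_shape[0] * image_shape[1], 3 * n_points))
view_a = rng.normal(size=image_shape)
view_b = rng.normal(size=image_shape)

prediction = predict_other_view(view_a, rotation_about_y(0.5), w_enc, w_dec, n_points)
loss = view_synthesis_loss(prediction, view_b)
```

In this sketch the loss needs only pairs of images and the relative rotation between their cameras, no pose labels. A small supervised regressor from the latent points to 3D joints could then be trained on the few labeled samples, which is the semi-supervised setting the abstract refers to.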

Cite

Text

Rhodin et al. "Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01249-6_46

Markdown

[Rhodin et al. "Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/rhodin2018eccv-unsupervised/) doi:10.1007/978-3-030-01249-6_46

BibTeX

@inproceedings{rhodin2018eccv-unsupervised,
  title     = {{Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation}},
  author    = {Rhodin, Helge and Salzmann, Mathieu and Fua, Pascal},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2018},
  doi       = {10.1007/978-3-030-01249-6_46},
  url       = {https://mlanthology.org/eccv/2018/rhodin2018eccv-unsupervised/}
}