The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation

Abstract

Fitting an articulated model to image data is often approached as an optimization over both model pose and model-to-image correspondence. For complex models such as humans, previous work has required a good initialization, or an alternating minimization between correspondence and pose. In this paper we investigate one-shot pose estimation: can we directly infer correspondences using a regression function trained to be invariant to body size and shape, and then optimize the model pose just once? We evaluate on several challenging single-frame data sets containing a wide variety of body poses, shapes, torso rotations, and image cropping. Our experiments demonstrate that one-shot pose estimation achieves state-of-the-art results and runs in real time.
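
To illustrate the "infer correspondences first, then optimize pose once" idea, here is a minimal toy sketch, not the paper's method. It assumes a single rigid 2D part instead of an articulated human model, and it assumes the correspondences are already given by index (standing in for the output of a trained regressor). Under those assumptions the pose has a closed-form least-squares solution, so no alternation between correspondence and pose is needed. The function names `fit_rigid_2d` and `apply_rigid_2d` are hypothetical.

```python
import math


def fit_rigid_2d(model_pts, image_pts):
    """Least-squares rotation angle and translation mapping model_pts onto image_pts.

    Assumption: correspondences are known, i.e. model_pts[i] matches image_pts[i].
    """
    n = len(model_pts)
    # Centroids of both point sets.
    mcx = sum(x for x, _ in model_pts) / n
    mcy = sum(y for _, y in model_pts) / n
    icx = sum(x for x, _ in image_pts) / n
    icy = sum(y for _, y in image_pts) / n
    # Accumulate dot and cross terms of the centered point pairs.
    dot = 0.0
    cross = 0.0
    for (mx, my), (ix, iy) in zip(model_pts, image_pts):
        ax, ay = mx - mcx, my - mcy
        bx, by = ix - icx, iy - icy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # Optimal rotation angle for the fixed correspondences.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated model centroid onto the image centroid.
    tx = icx - (c * mcx - s * mcy)
    ty = icy - (s * mcx + c * mcy)
    return theta, (tx, ty)


def apply_rigid_2d(points, theta, translation):
    """Rotate points by theta about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    tx, ty = translation
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


# Synthetic check: pose a toy model, then recover the pose in a single solve.
model = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (1.5, 1.0)]
observed = apply_rigid_2d(model, 0.5, (3.0, -1.0))
theta_hat, t_hat = fit_rigid_2d(model, observed)
```

An articulated model has many more pose parameters and generally requires an iterative optimizer rather than a closed form, but the structure is the same: the correspondences are fixed before the pose is optimized, so the optimization runs only once.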

Cite

Text

Taylor et al. "The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6247664

Markdown

[Taylor et al. "The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/taylor2012cvpr-vitruvian/) doi:10.1109/CVPR.2012.6247664

BibTeX

@inproceedings{taylor2012cvpr-vitruvian,
  title     = {{The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation}},
  author    = {Taylor, Jonathan and Shotton, Jamie and Sharp, Toby and Fitzgibbon, Andrew W.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2012},
  pages     = {103-110},
  doi       = {10.1109/CVPR.2012.6247664},
  url       = {https://mlanthology.org/cvpr/2012/taylor2012cvpr-vitruvian/}
}