The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation
Abstract
Fitting an articulated model to image data is often approached as an optimization over both model pose and model-to-image correspondence. For complex models such as humans, previous work has required a good initialization, or an alternating minimization between correspondence and pose. In this paper we investigate one-shot pose estimation: can we directly infer correspondences using a regression function trained to be invariant to body size and shape, and then optimize the model pose just once? We evaluate on several challenging single-frame data sets containing a wide variety of body poses, shapes, torso rotations, and image cropping. Our experiments demonstrate that one-shot pose estimation achieves state-of-the-art results and runs in real time.
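The contrast between alternating minimization and a single optimization can be illustrated with a toy case. The sketch below is not the paper's method: it assumes a single rigid 2D shape instead of an articulated human model, and it assumes the model-to-image correspondences are already known, whereas the paper infers them with a trained regressor. Under those assumptions, the pose (one rotation and one translation) follows from a single closed-form least-squares solve, with no initialization and no alternation. The function name `fit_rigid_2d` and the sample points are hypothetical.

```python
import math


def fit_rigid_2d(model_pts, image_pts):
    """Least-squares rotation and translation mapping model_pts onto image_pts.

    Both arguments are equal-length lists of (x, y) tuples, where
    model_pts[i] is assumed to correspond to image_pts[i].
    Returns (theta, (tx, ty)) with theta in radians.
    """
    n = len(model_pts)
    if n == 0 or n != len(image_pts):
        raise ValueError("need two non-empty point lists of equal length")

    # Centroids of both point sets.
    mcx = sum(x for x, _ in model_pts) / n
    mcy = sum(y for _, y in model_pts) / n
    icx = sum(x for x, _ in image_pts) / n
    icy = sum(y for _, y in image_pts) / n

    # Accumulate the dot-product and cross-product terms of the centered
    # points; the optimal rotation angle is atan2(cross, dot).
    dot = 0.0
    cross = 0.0
    for (mx, my), (ix, iy) in zip(model_pts, image_pts):
        ax, ay = mx - mcx, my - mcy
        bx, by = ix - icx, iy - icy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)

    # Translation that carries the rotated model centroid onto the image centroid.
    tx = icx - (c * mcx - s * mcy)
    ty = icy - (s * mcx + c * mcy)
    return theta, (tx, ty)


def apply_rigid_2d(theta, t, pts):
    """Rotate pts by theta and translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]
```

When correspondences are unknown, a method such as iterative closest point has to alternate between guessing them and re-solving for the pose; the sketch shows why supplying them up front reduces the problem to one solve.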
Cite
Text
Taylor et al. "The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6247664
Markdown
[Taylor et al. "The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/taylor2012cvpr-vitruvian/) doi:10.1109/CVPR.2012.6247664
BibTeX
@inproceedings{taylor2012cvpr-vitruvian,
title = {{The Vitruvian Manifold: Inferring Dense Correspondences for One-Shot Human Pose Estimation}},
author = {Taylor, Jonathan and Shotton, Jamie and Sharp, Toby and Fitzgibbon, Andrew W.},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2012},
pages = {103-110},
doi = {10.1109/CVPR.2012.6247664},
url = {https://mlanthology.org/cvpr/2012/taylor2012cvpr-vitruvian/}
}