Context and Observation Driven Latent Variable Model for Human Pose Estimation

Abstract

Current approaches to pose estimation and tracking can be classified into two categories: generative and discriminative. While generative approaches can accurately determine human pose from image observations, they are computationally expensive because they must search the high-dimensional human pose space. Discriminative approaches, on the other hand, are computationally efficient but do not generalize well. We present a hybrid model that combines the strengths of the two in an integrated learning and inference framework. We extend the Gaussian process latent variable model (GPLVM) to include an embedding from observation space (the space of image features) to the latent space. GPLVM is a generative model, but the inclusion of this mapping provides a discriminative component, making the model observation driven. Observation Driven GPLVM (OD-GPLVM) not only provides a faster inference approach, but also gives more accurate estimates than GPLVM in cases where dynamics are not sufficient to initialize the search in the latent space. We also extend OD-GPLVM to learn and estimate poses from parameterized actions/gestures. Parameterized gestures are actions that exhibit large systematic variation in joint angle space across instances due to differences in contextual variables. For example, the joint angles in a forehand tennis shot are a function of the height of the ball (Figure 2). We learn these systematic variations as a function of the contextual variables. We then present an approach that uses information from the scene and objects to provide context for human pose estimation for such parameterized actions.
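
The observation-driven idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses plain Gaussian process regression with an RBF kernel, fixed hyperparameters, and synthetic data, and it treats the latent positions as already learned. The helper names (`rbf_kernel`, `gp_fit`, `gp_predict_mean`, `estimate_pose`) and all dimensions are assumptions made for illustration. It shows only how a mapping from image features to the latent space can supply a starting point in the latent space, which a latent-to-pose mapping then decodes.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_fit(inputs, targets, noise=1e-3, **kern):
    """Precompute the weights for the GP posterior mean (hyperparameters are fixed)."""
    K = rbf_kernel(inputs, inputs, **kern) + noise * np.eye(len(inputs))
    alpha = np.linalg.solve(K, targets)
    return {"inputs": inputs, "alpha": alpha, "kern": kern}

def gp_predict_mean(model, new_inputs):
    """GP posterior mean at new_inputs."""
    K_star = rbf_kernel(new_inputs, model["inputs"], **model["kern"])
    return K_star @ model["alpha"]

# Synthetic stand-ins (assumptions): 2-D latent points on a circle,
# 6-D "poses" and 4-D "image features" generated from them.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
latent = np.stack([np.cos(t), np.sin(t)], axis=1)
poses = np.tanh(latent @ rng.normal(size=(2, 6)))
features = latent @ rng.normal(size=(2, 4)) + 0.01 * rng.normal(size=(40, 4))

# Discriminative component: observation space -> latent space.
feat_to_latent = gp_fit(features, latent, lengthscale=1.0)
# Generative component: latent space -> pose space.
latent_to_pose = gp_fit(latent, poses, lengthscale=1.0)

def estimate_pose(feature_vec):
    """Predict a latent starting point from the features, then decode it to a pose."""
    x0 = gp_predict_mean(feat_to_latent, feature_vec[None, :])
    pose = gp_predict_mean(latent_to_pose, x0)
    return x0[0], pose[0]

x0, pose = estimate_pose(features[10])
```

In a fuller system the predicted latent point `x0` would seed a search in the latent space rather than be decoded directly; the sketch stops at the initialization step.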

Cite

Text

Gupta et al. "Context and Observation Driven Latent Variable Model for Human Pose Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2008. doi:10.1109/CVPR.2008.4587511

Markdown

[Gupta et al. "Context and Observation Driven Latent Variable Model for Human Pose Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2008.](https://mlanthology.org/cvpr/2008/gupta2008cvpr-context/) doi:10.1109/CVPR.2008.4587511

BibTeX

@inproceedings{gupta2008cvpr-context,
  title     = {{Context and Observation Driven Latent Variable Model for Human Pose Estimation}},
  author    = {Gupta, Abhinav and Chen, Trista P. and Chen, Francine and Kimber, Don and Davis, Larry S.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2008},
  doi       = {10.1109/CVPR.2008.4587511},
  url       = {https://mlanthology.org/cvpr/2008/gupta2008cvpr-context/}
}