Simultaneous Pose Estimation and Camera Calibration from Multiple Views

Abstract

We present an algorithm to estimate the body pose of a walking person given synchronized video input from multiple uncalibrated cameras. We construct an appearance model of human walking motion by generating examples from the space of body poses and camera locations, and clustering them using expectation-maximization. Given a segmented input video sequence, we find the closest matching appearance cluster for each silhouette and use the sequence of matched clusters to extrapolate the position of the camera with respect to the person's direction of motion. For each frame, the matching cluster also provides an estimate of the walking phase. We combine these estimates from all views and find the most likely sequence of walking poses using a cyclical, feed-forward hidden Markov model. Our algorithm requires no manual initialization and no prior knowledge about the locations of the cameras.
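The final step of the abstract, finding the most likely sequence of walking poses with a cyclical, feed-forward hidden Markov model, can be illustrated with a minimal Viterbi sketch. This is not the authors' implementation. It assumes that the walking cycle is discretized into a fixed number of phases, that "feed-forward" means each phase either repeats or advances to the next (wrapping from the last phase back to the first), and that per-view phase likelihoods are combined by multiplication, that is, that views are conditionally independent given the phase. The function names, the `p_stay` parameter, and the array layout are hypothetical.

```python
import numpy as np


def cyclic_transition_matrix(n_phases, p_stay=0.5):
    """Build a cyclic left-to-right transition matrix (assumed structure).

    Each phase either repeats with probability p_stay or advances to the
    next phase; the last phase wraps around to the first.
    """
    A = np.zeros((n_phases, n_phases))
    for i in range(n_phases):
        A[i, i] += p_stay
        A[i, (i + 1) % n_phases] += 1.0 - p_stay
    return A


def viterbi_phases(view_likelihoods, A):
    """Return the most likely phase index per frame.

    view_likelihoods has shape (n_views, n_frames, n_phases). Entry
    [v, t, k] is the likelihood of phase k in frame t according to view v.
    Views are fused by summing log-likelihoods (independence assumption).
    """
    eps = 1e-12  # avoids log(0) for phases a view rules out entirely
    log_emit = np.log(view_likelihoods + eps).sum(axis=0)
    with np.errstate(divide="ignore"):
        log_A = np.log(A)  # forbidden transitions become -inf

    n_frames, n_phases = log_emit.shape
    score = np.full((n_frames, n_phases), -np.inf)
    back = np.zeros((n_frames, n_phases), dtype=int)

    # Uniform prior over the starting phase.
    score[0] = -np.log(n_phases) + log_emit[0]
    for t in range(1, n_frames):
        # cand[i, j] is the best score ending in phase i, then moving to j.
        cand = score[t - 1][:, None] + log_A
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[t]

    # Trace the best path backwards from the best final phase.
    path = np.zeros(n_frames, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

In this sketch, a view that is uninformative for a given frame (a flat likelihood over phases) adds the same constant to every phase and so does not change the decoded path, which is one reason a product of per-view likelihoods is a convenient fusion rule under the stated independence assumption.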

Cite

Text

Izo and Grimson. "Simultaneous Pose Estimation and Camera Calibration from Multiple Views." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2004. doi:10.1109/CVPR.2004.439

Markdown

[Izo and Grimson. "Simultaneous Pose Estimation and Camera Calibration from Multiple Views." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2004.](https://mlanthology.org/cvpr/2004/izo2004cvpr-simultaneous/) doi:10.1109/CVPR.2004.439

BibTeX

@inproceedings{izo2004cvpr-simultaneous,
  title     = {{Simultaneous Pose Estimation and Camera Calibration from Multiple Views}},
  author    = {Izo, Tomás and Grimson, W. Eric L.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2004},
  pages     = {14},
  doi       = {10.1109/CVPR.2004.439},
  url       = {https://mlanthology.org/cvpr/2004/izo2004cvpr-simultaneous/}
}