Shadow Puppetry

Abstract

The mapping between 3D body poses and 2D shadows is fundamentally many-to-many and defeats regression methods, even with windowed context. We show how to learn a function between paths in the two systems, resolving ambiguities by integrating information over the entire length of a sequence. The basis of this function is a configural and dynamical manifold that summarizes the target system's behaviour. This manifold can be modeled from data with a hidden Markov model having special topological properties that we obtain via entropy minimization. Inference is then a matter of solving for the geodesic on the manifold that best explains the evidence in the cue sequence. We give a closed-form maximum a posteriori solution for geodesics through the learned density space, thereby obtaining optimal paths over the dynamical manifold. These methods give a completely general way to perform inference over time-series; in vision they support analysis, recognition, classification and synthesis of behaviours in linear time. We demonstrate with a prototype that infers 3D from monocular monochromatic sequences (e.g., back-subtractions), without using any articulatory body model. The framework readily accommodates multiple cameras and other sources of evidence such as optical flow or feature tracking.
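As a rough illustration of why integrating evidence over a whole sequence resolves per-frame ambiguity, the sketch below runs standard Viterbi decoding on a toy discrete hidden Markov model. This is not the paper's entropy-minimized model or its closed-form geodesic solution; it is a textbook stand-in. The two hidden "poses", the three "shadow" symbols, and all probabilities are invented for the example.

```python
import math


def viterbi(log_init, log_trans, log_emit, observations):
    """Return the most probable hidden-state path of a discrete HMM."""
    n_states = len(log_init)
    # score[s]: best log-probability of any path that ends in state s.
    score = [log_init[s] + log_emit[s][observations[0]] for s in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        prev = score
        step_back = []
        score = []
        for s in range(n_states):
            # Best predecessor of state s at this time step.
            best_prev = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            step_back.append(best_prev)
            score.append(prev[best_prev] + log_trans[best_prev][s] + log_emit[s][obs])
        backpointers.append(step_back)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for step_back in reversed(backpointers):
        state = step_back[state]
        path.append(state)
    path.reverse()
    return path


def log_table(rows):
    """Convert a table of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical toy model: hidden poses 0 and 1; shadow symbols
# 0 (looks like pose 0), 1 (ambiguous), 2 (looks like pose 1).
LOG_INIT = [math.log(0.5), math.log(0.5)]
LOG_TRANS = log_table([[0.9, 0.1], [0.1, 0.9]])  # poses change slowly
LOG_EMIT = log_table([[0.49, 0.50, 0.01], [0.01, 0.50, 0.49]])

# Four ambiguous frames and one informative frame in the middle.
shadows = [1, 1, 0, 1, 1]
poses = viterbi(LOG_INIT, LOG_TRANS, LOG_EMIT, shadows)
```

In this toy setting, symbol 1 is equally likely under both poses, so no single frame (or short window of ambiguous frames) can be decoded on its own; the one informative frame, combined with the slow-changing dynamics, fixes the labels of the whole sequence. The cost grows linearly with sequence length for a fixed number of states.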

Cite

Text

Brand. "Shadow Puppetry." IEEE/CVF International Conference on Computer Vision, 1999. doi:10.1109/ICCV.1999.790422

Markdown

[Brand. "Shadow Puppetry." IEEE/CVF International Conference on Computer Vision, 1999.](https://mlanthology.org/iccv/1999/brand1999iccv-shadow/) doi:10.1109/ICCV.1999.790422

BibTeX

@inproceedings{brand1999iccv-shadow,
  title     = {{Shadow Puppetry}},
  author    = {Brand, Matthew},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year      = {1999},
  pages     = {1237--1244},
  doi       = {10.1109/ICCV.1999.790422},
  url       = {https://mlanthology.org/iccv/1999/brand1999iccv-shadow/}
}