Shadow Puppetry
Abstract
The mapping between 3D body poses and 2D shadows is fundamentally many-to-many and defeats regression methods, even with windowed context. We show how to learn a function between paths in the two systems, resolving ambiguities by integrating information over the entire length of a sequence. The basis of this function is a configural and dynamical manifold that summarizes the target system's behaviour. This manifold can be modeled from data with a hidden Markov model having special topological properties that we obtain via entropy minimization. Inference is then a matter of solving for the geodesic on the manifold that best explains the evidence in the cue sequence. We give a closed-form maximum a posteriori solution for geodesics through the learned density space, thereby obtaining optimal paths over the dynamical manifold. These methods give a completely general way to perform inference over time-series; in vision they support analysis, recognition, classification and synthesis of behaviours in linear time. We demonstrate with a prototype that infers 3D from monocular monochromatic sequences (e.g., back-subtractions), without using any articulatory body model. The framework readily accommodates multiple cameras and other sources of evidence such as optical flow or feature tracking.
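As a rough illustration of how sequence-level inference can resolve frame-level ambiguity, the sketch below decodes the most probable hidden-state path of a small hidden Markov model with the standard Viterbi algorithm. This is a swapped-in textbook technique, not the paper's closed-form MAP geodesic solution or its entropy-minimized model; the two-state model, its probabilities, and the function name `viterbi` are hypothetical values chosen only for the example.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most probable state path of an HMM (standard Viterbi, log domain).

    log_pi: (S,)   log initial state probabilities
    log_A:  (S, S) log transitions, log_A[i, j] = log P(state j | state i)
    log_B:  (T, S) log-likelihood of each frame's observation under each state
    """
    T, S = log_B.shape
    delta = np.empty((T, S))            # best log-score of any path ending in each state
    back = np.zeros((T, S), dtype=int)  # best predecessor for each state at each frame
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: come from i, go to j
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(S)] + log_B[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical two-state model with "sticky" dynamics (assumed numbers).
log_pi = np.log(np.array([0.5, 0.5]))
log_A = np.log(np.array([[0.8, 0.2],
                         [0.2, 0.8]]))
# Per-frame observation likelihoods; the middle frame is ambiguous and
# slightly favours state 1 when considered in isolation.
log_B = np.log(np.array([[0.9, 0.1],
                         [0.4, 0.6],
                         [0.9, 0.1]]))

framewise = log_B.argmax(axis=1)      # per-frame guess: [0, 1, 0]
path = viterbi(log_pi, log_A, log_B)  # sequence-level decode: [0, 0, 0]
```

Decoding frame by frame flips to state 1 on the ambiguous middle frame, whereas the path-level decode stays in state 0 because the dynamics make a brief excursion unlikely. The run time is linear in the sequence length for a fixed number of states.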
Cite
Text
Brand. "Shadow Puppetry." IEEE/CVF International Conference on Computer Vision, 1999. doi:10.1109/ICCV.1999.790422
Markdown
[Brand. "Shadow Puppetry." IEEE/CVF International Conference on Computer Vision, 1999.](https://mlanthology.org/iccv/1999/brand1999iccv-shadow/) doi:10.1109/ICCV.1999.790422
BibTeX
@inproceedings{brand1999iccv-shadow,
title = {{Shadow Puppetry}},
author = {Brand, Matthew},
booktitle = {IEEE/CVF International Conference on Computer Vision},
year = {1999},
pages = {1237-1244},
doi = {10.1109/ICCV.1999.790422},
url = {https://mlanthology.org/iccv/1999/brand1999iccv-shadow/}
}