Bridging the Gap Between Detection and Tracking for 3D Monocular Video-Based Motion Capture
Abstract
We combine detection and tracking techniques to achieve robust 3-D motion recovery of people seen from arbitrary viewpoints by a single and potentially moving camera. We rely on detecting key postures, which can be done reliably, using a motion model to infer 3-D poses between consecutive detections, and finally refining them over the whole sequence using a generative model. We demonstrate our approach in the case of people walking against cluttered backgrounds and filmed using a moving camera, which precludes the use of simple background subtraction techniques. In this case, the easy-to-detect posture is the one that occurs at the end of each step when people have their legs furthest apart.
Cite
Text
Fossati et al. "Bridging the Gap Between Detection and Tracking for 3D Monocular Video-Based Motion Capture." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007. doi:10.1109/CVPR.2007.383297
Markdown
[Fossati et al. "Bridging the Gap Between Detection and Tracking for 3D Monocular Video-Based Motion Capture." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007.](https://mlanthology.org/cvpr/2007/fossati2007cvpr-bridging/) doi:10.1109/CVPR.2007.383297
BibTeX
@inproceedings{fossati2007cvpr-bridging,
title = {{Bridging the Gap Between Detection and Tracking for 3D Monocular Video-Based Motion Capture}},
author = {Fossati, Andrea and Dimitrijevic, Miodrag and Lepetit, Vincent and Fua, Pascal},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2007},
doi = {10.1109/CVPR.2007.383297},
url = {https://mlanthology.org/cvpr/2007/fossati2007cvpr-bridging/}
}