Extracting Moving People from Internet Videos

Abstract

We propose a fully automatic framework to detect and extract arbitrary human motion volumes from real-world videos collected from YouTube. Our system is composed of two stages. A person detector is first applied to provide crude information about the possible locations of humans. Then a constrained clustering algorithm groups the detections and rejects false positives based on appearance similarity and spatio-temporal coherence. In the second stage, we apply a top-down pictorial structure model to complete the extraction of the humans in arbitrary motion. During this procedure, a density propagation technique based on a mixture of Gaussians is employed to propagate temporal information in a principled way. This method greatly reduces the search space for measurement in the inference stage. We demonstrate the initial success of this framework both quantitatively and qualitatively on a number of YouTube videos.
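To illustrate the idea behind the density propagation step, the following is a minimal sketch (not the authors' implementation) of propagating a mixture-of-Gaussians belief over a person's position from one frame to the next. Each component is a `(weight, mean, variance)` triple over a 1-D image coordinate; the motion model, noise values, and function names are illustrative assumptions.

```python
import math

def predict(mixture, velocity=2.0, process_var=4.0):
    """Propagate each Gaussian component one frame ahead: shift the mean
    by an assumed constant velocity and inflate the variance by the
    motion noise (illustrative constant-velocity model)."""
    return [(w, m + velocity, v + process_var) for (w, m, v) in mixture]

def search_windows(mixture, n_sigma=3.0):
    """3-sigma gating intervals per component: rather than scanning the
    whole frame, measurements are taken only inside these windows,
    which is how propagation shrinks the search space."""
    return [(m - n_sigma * math.sqrt(v), m + n_sigma * math.sqrt(v))
            for (_, m, v) in mixture]

# Two competing hypotheses about the person's x-position.
belief = [(0.7, 100.0, 9.0), (0.3, 140.0, 16.0)]
belief = predict(belief)
windows = search_windows(belief)
```

In the paper's setting the belief is over full body-part configurations rather than a scalar position, but the principle is the same: the propagated mixture restricts where the pictorial structure model must be evaluated in the next frame.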

Cite

Text

Niebles et al. "Extracting Moving People from Internet Videos." European Conference on Computer Vision, 2008. doi:10.1007/978-3-540-88693-8_39

Markdown

[Niebles et al. "Extracting Moving People from Internet Videos." European Conference on Computer Vision, 2008.](https://mlanthology.org/eccv/2008/niebles2008eccv-extracting/) doi:10.1007/978-3-540-88693-8_39

BibTeX

@inproceedings{niebles2008eccv-extracting,
  title     = {{Extracting Moving People from Internet Videos}},
  author    = {Niebles, Juan Carlos and Han, Bohyung and Ferencz, Andras and Fei-Fei, Li},
  booktitle = {European Conference on Computer Vision},
  year      = {2008},
  pages     = {527-540},
  doi       = {10.1007/978-3-540-88693-8_39},
  url       = {https://mlanthology.org/eccv/2008/niebles2008eccv-extracting/}
}