A Pose-Invariant Descriptor for Human Detection and Segmentation

Abstract

We present a learning-based, sliding-window approach to detecting humans in still images. Instead of the traditional encoding that concatenates features by image location, we introduce a global descriptor that is more invariant to pose variation. Specifically, we propose a principled approach to learning and classifying human/non-human image patterns by simultaneously segmenting human shapes and poses and extracting articulation-insensitive features. The shapes and poses are segmented by an efficient, probabilistic hierarchical part-template matching algorithm, and the features are collected in the context of poses by tracing around the estimated shape boundaries. Histograms of oriented gradients serve as the source of low-level features from which our pose-invariant descriptors are computed, and kernel SVMs are adopted as the test classifiers. We evaluate our detection and segmentation approach on two public pedestrian datasets.
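
To make the sliding-window setting concrete, here is a minimal Python sketch of the generic pipeline the abstract builds on: each window is turned into a magnitude-weighted orientation histogram and passed to a scoring function. It is not the paper's method. It omits the part-template matching and the pose-invariant descriptor entirely, and the function names, the 128x64 window, the stride, the 9 bins, and the `score_fn` stand-in for a trained kernel SVM are all illustrative assumptions.

```python
import numpy as np


def orientation_histogram(patch, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    A simplified stand-in for a low-level HOG feature: one histogram for the
    whole patch, with no cells or blocks (an assumption for brevity).
    """
    gy, gx = np.gradient(patch.astype(float))   # derivatives along rows, cols
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % np.pi          # unsigned orientation in [0, pi)
    # Clamp guards against rounding that could land exactly on the upper edge.
    bins = np.minimum((angle / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def sliding_windows(image, win_h, win_w, stride):
    """Yield (row, col, patch) for every window fully inside the image."""
    rows, cols = image.shape
    for r in range(0, rows - win_h + 1, stride):
        for c in range(0, cols - win_w + 1, stride):
            yield r, c, image[r:r + win_h, c:c + win_w]


def score_windows(image, score_fn, win_h=128, win_w=64, stride=16):
    """Score each window; score_fn is a hypothetical stand-in for a classifier."""
    return [
        (r, c, score_fn(orientation_histogram(patch)))
        for r, c, patch in sliding_windows(image, win_h, win_w, stride)
    ]
```

In a real detector, `score_fn` would be the decision function of a trained classifier, and windows scoring above a threshold would be kept and merged by non-maximum suppression.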

Cite

Text

Lin and Davis. "A Pose-Invariant Descriptor for Human Detection and Segmentation." European Conference on Computer Vision, 2008. doi:10.1007/978-3-540-88693-8_31

Markdown

[Lin and Davis. "A Pose-Invariant Descriptor for Human Detection and Segmentation." European Conference on Computer Vision, 2008.](https://mlanthology.org/eccv/2008/lin2008eccv-pose/) doi:10.1007/978-3-540-88693-8_31

BibTeX

@inproceedings{lin2008eccv-pose,
  title     = {{A Pose-Invariant Descriptor for Human Detection and Segmentation}},
  author    = {Lin, Zhe and Davis, Larry S.},
  booktitle = {European Conference on Computer Vision},
  year      = {2008},
  pages     = {423--436},
  doi       = {10.1007/978-3-540-88693-8_31},
  url       = {https://mlanthology.org/eccv/2008/lin2008eccv-pose/}
}