Integrating Grammar and Segmentation for Human Pose Estimation
Abstract
In this paper we present a compositional and-or graph grammar model for human pose estimation. Our model has three distinguishing features: (i) large appearance differences between people are handled compositionally by allowing parts or collections of parts to be substituted with alternative variants, (ii) each variant is a sub-model that can define its own articulated geometry and context-sensitive compatibility with neighboring part variants, and (iii) background region segmentation is incorporated into the part appearance models to better estimate the contrast of a part region against its surroundings and to improve resilience to background clutter. The resulting integrated model is trained discriminatively in a max-margin framework using an efficient and exact inference algorithm. We present an experimental evaluation of our model on two popular datasets and show performance improvements over the state of the art on both benchmarks.
Cite
Text
Rothrock et al. "Integrating Grammar and Segmentation for Human Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2013. doi:10.1109/CVPR.2013.413
Markdown
[Rothrock et al. "Integrating Grammar and Segmentation for Human Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2013.](https://mlanthology.org/cvpr/2013/rothrock2013cvpr-integrating/) doi:10.1109/CVPR.2013.413
BibTeX
@inproceedings{rothrock2013cvpr-integrating,
title = {{Integrating Grammar and Segmentation for Human Pose Estimation}},
author = {Rothrock, Brandon and Park, Seyoung and Zhu, Song-Chun},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2013},
doi = {10.1109/CVPR.2013.413},
url = {https://mlanthology.org/cvpr/2013/rothrock2013cvpr-integrating/}
}