Multi-Source Deep Learning for Human Pose Estimation
Abstract
Visual appearance score, appearance mixture type, and deformation are three important information sources for human pose estimation. This paper proposes building a multi-source deep model to extract non-linear representations from these different information sources. With the deep model, the global, high-order human body articulation patterns in these information sources are extracted for pose estimation. The task of estimating body part locations and the task of human detection are jointly learned using a unified deep model. The proposed approach can be viewed as a post-processing of pose estimation results and can flexibly integrate with existing methods by taking their information sources as input. By extracting non-linear representations from multiple information sources, the deep model outperforms the state of the art by up to 8.6 percent on three public benchmark datasets.
Cite
Text
Ouyang et al. "Multi-Source Deep Learning for Human Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2014. doi:10.1109/CVPR.2014.299
Markdown
[Ouyang et al. "Multi-Source Deep Learning for Human Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2014.](https://mlanthology.org/cvpr/2014/ouyang2014cvpr-multisource/) doi:10.1109/CVPR.2014.299
BibTeX
@inproceedings{ouyang2014cvpr-multisource,
title = {{Multi-Source Deep Learning for Human Pose Estimation}},
author = {Ouyang, Wanli and Chu, Xiao and Wang, Xiaogang},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2014},
doi = {10.1109/CVPR.2014.299},
url = {https://mlanthology.org/cvpr/2014/ouyang2014cvpr-multisource/}
}