Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation

Abstract

This paper focuses on structured-output learning using deep neural networks for 3D human pose estimation from monocular images. Our network takes an image and a 3D pose as inputs and outputs a score value, which is high when the image-pose pair matches and low otherwise. The network structure consists of a convolutional neural network for image feature extraction, followed by two sub-networks for transforming the image features and pose into a joint embedding. The score function is then the dot product between the image and pose embeddings. The image-pose embedding and score function are jointly trained using a maximum-margin cost function. Our proposed framework can be interpreted as a special form of structured support vector machines where the joint feature space is discriminatively learned using deep neural networks. We test our framework on the Human3.6M dataset and obtain state-of-the-art results compared to other recent methods. Finally, we present visualizations of the image-pose embedding space, demonstrating that the network has learned a high-level embedding of body orientation and pose configuration.
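The scoring and training idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the one-layer ReLU sub-networks, the helper names (`make_params`, `embed`, `score`, `margin_loss`), the finite candidate set, and the use of mean per-joint distance as the margin term are all assumptions made for illustration, and the convolutional feature extractor is replaced by a precomputed feature vector.

```python
import numpy as np

def make_params(rng, img_dim, pose_dim, embed_dim):
    """Randomly initialise two hypothetical one-layer embedding sub-networks."""
    return {
        "W_img": rng.normal(scale=0.1, size=(embed_dim, img_dim)),
        "b_img": np.zeros(embed_dim),
        "W_pose": rng.normal(scale=0.1, size=(embed_dim, pose_dim)),
        "b_pose": np.zeros(embed_dim),
    }

def embed(x, W, b):
    """Affine map followed by ReLU (a stand-in for a deeper sub-network)."""
    return np.maximum(0.0, W @ x + b)

def score(img_feat, pose, params):
    """Dot product between the image embedding and the pose embedding."""
    f_img = embed(img_feat, params["W_img"], params["b_img"])
    f_pose = embed(pose, params["W_pose"], params["b_pose"])
    return float(f_img @ f_pose)

def pose_error(pose_a, pose_b):
    """Mean Euclidean distance per joint; poses are flat (x, y, z) vectors."""
    diff = (pose_a - pose_b).reshape(-1, 3)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

def margin_loss(img_feat, true_pose, candidate_poses, params):
    """Margin-rescaled hinge loss over a finite candidate set (assumed).

    A candidate is penalised when its score plus its pose error exceeds
    the score of the ground-truth pose.
    """
    s_true = score(img_feat, true_pose, params)
    loss = 0.0  # starting at zero implements the hinge max(0, .)
    for pose in candidate_poses:
        violation = score(img_feat, pose, params) + pose_error(pose, true_pose) - s_true
        loss = max(loss, violation)
    return loss
```

In this sketch the loss is zero only when the ground-truth pose outscores every candidate by at least that candidate's pose error, which is the structured-SVM reading mentioned in the abstract. At test time, under the same assumptions, one would pick the candidate pose with the highest `score` for a given image feature.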

Cite

Text

Li et al. "Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation." International Conference on Computer Vision, 2015. doi:10.1109/ICCV.2015.326

Markdown

[Li et al. "Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation." International Conference on Computer Vision, 2015.](https://mlanthology.org/iccv/2015/li2015iccv-maximummargin/) doi:10.1109/ICCV.2015.326

BibTeX

@inproceedings{li2015iccv-maximummargin,
  title     = {{Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation}},
  author    = {Li, Sijin and Zhang, Weichen and Chan, Antoni B.},
  booktitle = {International Conference on Computer Vision},
  year      = {2015},
  doi       = {10.1109/ICCV.2015.326},
  url       = {https://mlanthology.org/iccv/2015/li2015iccv-maximummargin/}
}