Graph Stacked Hourglass Networks for 3D Human Pose Estimation

Abstract

In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. The proposed architecture consists of repeated encoder-decoder modules, in which graph-structured features are processed across three different scales of human skeletal representations. This multi-scale architecture enables the model to learn both local and global feature representations, which are critical for 3D human pose estimation. We also introduce a multi-level feature learning approach that uses intermediate features from different depths, and we show the performance improvements that result from exploiting multi-scale, multi-level feature representations. Extensive experiments are conducted to validate our approach, and the results show that our model outperforms the state of the art.
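The multi-scale encoder-decoder idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the 5-joint skeleton, the joint groupings used for pooling, the mean-pooling and copy-unpooling rules, and the weight-free neighbour averaging that stands in for a learned graph convolution are all assumptions made for illustration. The multi-level feature learning step is omitted.

```python
# Illustrative sketch only. The skeleton, the pooling groups, and the
# weight-free "graph convolution" are assumptions, not the paper's setup.

# Hypothetical toy skeleton: 0=hip, 1=spine, 2=head, 3=left hand, 4=right hand
EDGES_FINE = [(0, 1), (1, 2), (1, 3), (1, 4)]
# Hypothetical pooling groups: 5 joints -> 3 body parts -> 1 whole-body node
GROUPS_MID = [[0, 1], [2], [3, 4]]
EDGES_MID = [(0, 1), (0, 2)]
GROUPS_TOP = [[0, 1, 2]]


def normalized_adjacency(n, edges):
    """Row-normalized adjacency matrix with self-loops."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def graph_conv(feats, adj):
    """Average each node's features with its neighbours (learned weights omitted)."""
    n, d = len(feats), len(feats[0])
    return [[sum(adj[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
            for i in range(n)]


def pool(feats, groups):
    """Mean-pool fine nodes into coarser nodes, one per group."""
    d = len(feats[0])
    return [[sum(feats[j][c] for j in g) / len(g) for c in range(d)]
            for g in groups]


def unpool(coarse, groups, n):
    """Copy each coarse node's features back to every fine node in its group."""
    out = [None] * n
    for gi, g in enumerate(groups):
        for j in g:
            out[j] = list(coarse[gi])
    return out


def add(a, b):
    """Element-wise sum, used here as a skip connection."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def hourglass(feats):
    """One encoder-decoder pass over three skeletal scales."""
    adj_fine = normalized_adjacency(5, EDGES_FINE)
    adj_mid = normalized_adjacency(3, EDGES_MID)
    # Encoder: joints -> body parts -> whole body
    fine = graph_conv(feats, adj_fine)
    mid = graph_conv(pool(fine, GROUPS_MID), adj_mid)
    top = pool(mid, GROUPS_TOP)
    # Decoder: unpool and merge with the same-scale encoder features
    mid_up = add(mid, unpool(top, GROUPS_TOP, 3))
    fine_up = add(fine, unpool(mid_up, GROUPS_MID, 5))
    return graph_conv(fine_up, adj_fine)


def stacked_hourglass(feats, num_stacks=2):
    """Repeat the encoder-decoder module; num_stacks is an arbitrary choice here."""
    for _ in range(num_stacks):
        feats = hourglass(feats)
    return feats


if __name__ == "__main__":
    # Five joints with 2D input coordinates (made-up values).
    pose_2d = [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [-1.0, 1.0], [1.0, 1.0]]
    out = stacked_hourglass(pose_2d)
    print(len(out), len(out[0]))
```

In this sketch, the coarsest scale holds a single node that summarizes the whole body (a global feature), while the finest scale keeps per-joint neighbourhoods (local features). The skip connections let the decoder combine both. A trained model would replace the plain averaging with learned graph convolution layers and add a regression head that outputs 3D joint coordinates.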

Cite

Text

Xu and Takano. "Graph Stacked Hourglass Networks for 3D Human Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01584

Markdown

[Xu and Takano. "Graph Stacked Hourglass Networks for 3D Human Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/xu2021cvpr-graph/) doi:10.1109/CVPR46437.2021.01584

BibTeX

@inproceedings{xu2021cvpr-graph,
  title     = {{Graph Stacked Hourglass Networks for 3D Human Pose Estimation}},
  author    = {Xu, Tianhan and Takano, Wataru},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {16105--16114},
  doi       = {10.1109/CVPR46437.2021.01584},
  url       = {https://mlanthology.org/cvpr/2021/xu2021cvpr-graph/}
}