Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs

Abstract

Articulated hand pose estimation plays an important role in human-computer interaction. Despite recent progress, the accuracy of existing methods is still not satisfactory, partly due to the difficulty of the embedded high-dimensional, non-linear regression problem. Unlike existing discriminative methods that regress the hand pose directly from a single depth image, we propose to first project the query depth image onto three orthogonal planes and use these multi-view projections to regress 2D heat-maps that estimate the joint positions on each plane. These multi-view heat-maps are then fused, with learned pose priors, to produce the final 3D hand pose estimate. Experiments show that the proposed method largely outperforms the state of the art on a challenging dataset. Moreover, a cross-dataset experiment demonstrates the good generalization ability of the proposed method.
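The multi-view idea in the abstract can be illustrated with a minimal sketch: project a depth-derived point cloud onto the xy, yz, and xz planes, and fuse per-view 2D heat-maps for a joint into a single 3D estimate. This is not the authors' code; the function names, the occupancy-style projection, and the simple product-fusion rule (which stands in for the paper's learned fusion with pose priors) are illustrative assumptions.

```python
# Illustrative sketch of multi-view projection and heat-map fusion
# (assumed simplification, not the method's actual CNN pipeline).
import numpy as np

def project_views(points, res=32):
    """Project a point cloud normalized to [0, 1]^3 onto three orthogonal planes.

    Returns binary occupancy images for the xy, yz, and xz planes; the
    paper's CNNs would consume projections like these to predict heat-maps.
    """
    views = {}
    for name, (a, b) in {"xy": (0, 1), "yz": (1, 2), "xz": (0, 2)}.items():
        img = np.zeros((res, res))
        idx = np.clip((points[:, [a, b]] * (res - 1)).astype(int), 0, res - 1)
        img[idx[:, 0], idx[:, 1]] = 1.0  # mark occupied pixels
        views[name] = img
    return views

def fuse_heatmaps(hm_xy, hm_yz, hm_xz):
    """Fuse three per-view 2D heat-maps for one joint into a 3D estimate.

    Each 2D map is lifted into a 3D score volume along its missing axis via
    broadcasting; their product scores every voxel, and the argmax gives the
    joint position (here a naive product rule replaces the learned fusion).
    """
    vol = hm_xy[:, :, None] * hm_yz[None, :, :] * hm_xz[:, None, :]
    x, y, z = np.unravel_index(np.argmax(vol), vol.shape)
    res = hm_xy.shape[0]
    return np.array([x, y, z]) / (res - 1)  # back to normalized coordinates
```

For a joint at normalized position (x, y, z), the xy-view heat-map peaks near (x, y), the yz-view near (y, z), and the xz-view near (x, z); the product volume therefore peaks near (x, y, z), which is why fusing complementary 2D views can resolve the full 3D position.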

Cite

Text

Ge et al. "Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.391

Markdown

[Ge et al. "Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/ge2016cvpr-robust/) doi:10.1109/CVPR.2016.391

BibTeX

@inproceedings{ge2016cvpr-robust,
  title     = {{Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs}},
  author    = {Ge, Liuhao and Liang, Hui and Yuan, Junsong and Thalmann, Daniel},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.391},
  url       = {https://mlanthology.org/cvpr/2016/ge2016cvpr-robust/}
}