Efficient Hand Pose Estimation from a Single Depth Image

Abstract

We tackle the practical problem of hand pose estimation from a single noisy depth image. A dedicated three-step pipeline is proposed: the initial estimation step provides an initial estimate of the hand's in-plane orientation and 3D location; the candidate generation step produces a set of 3D pose candidates from the Hough voting space with the help of rotation-invariant depth features; the verification step delivers the final 3D hand pose as the solution to an optimization problem. We analyze the depth noise and suggest tips to minimize its negative impact on overall performance. Our approach works with Kinect-type noisy depth images and reliably and efficiently produces pose estimates for general motions (12 frames per second). Extensive experiments are conducted to qualitatively and quantitatively evaluate its performance against state-of-the-art methods that have access to additional RGB images. Our approach is shown to deliver on-par or even better results.
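As a rough illustration of the candidate generation idea, the sketch below accumulates votes in a coarse 3D grid and returns the most-voted bins as location candidates. It is not the authors' implementation: the function name `hough_candidates`, the `bin_size` and `top_k` parameters, and the assumption that each depth point already comes with a predicted offset to the hand location are all hypothetical. The abstract only says that the votes are derived from rotation-invariant depth features, and that step is not modeled here.

```python
from collections import Counter


def hough_candidates(points, offsets, bin_size=10.0, top_k=3):
    """Generic Hough voting sketch (hypothetical helper, not from the paper).

    points  : iterable of (x, y, z) depth points
    offsets : iterable of (dx, dy, dz) predicted offsets, one per point
              (assumed given; the paper derives votes from depth features)
    Returns up to top_k pairs of (bin center, vote count), most-voted first.
    """
    votes = Counter()
    for (px, py, pz), (ox, oy, oz) in zip(points, offsets):
        # Each point votes for the location point + offset, quantized to a bin.
        key = (
            int((px + ox) // bin_size),
            int((py + oy) // bin_size),
            int((pz + oz) // bin_size),
        )
        votes[key] += 1
    # Convert the winning bin indices back to bin-center coordinates.
    return [
        (tuple((i + 0.5) * bin_size for i in key), count)
        for key, count in votes.most_common(top_k)
    ]


if __name__ == "__main__":
    pts = [(0, 0, 500), (20, 0, 500), (0, 20, 500), (100, 100, 900)]
    offs = [(15, 15, 5), (-5, 15, 5), (15, -5, 5), (0, 0, 0)]
    for center, count in hough_candidates(pts, offs):
        print(center, count)
```

In a pipeline of the kind the abstract describes, candidates like these would then be passed to a verification step that selects the final pose; that optimization is outside the scope of this sketch.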

Cite

Text

Xu and Cheng. "Efficient Hand Pose Estimation from a Single Depth Image." International Conference on Computer Vision, 2013. doi:10.1109/ICCV.2013.429

Markdown

[Xu and Cheng. "Efficient Hand Pose Estimation from a Single Depth Image." International Conference on Computer Vision, 2013.](https://mlanthology.org/iccv/2013/xu2013iccv-efficient/) doi:10.1109/ICCV.2013.429

BibTeX

@inproceedings{xu2013iccv-efficient,
  title     = {{Efficient Hand Pose Estimation from a Single Depth Image}},
  author    = {Xu, Chi and Cheng, Li},
  booktitle = {International Conference on Computer Vision},
  year      = {2013},
  doi       = {10.1109/ICCV.2013.429},
  url       = {https://mlanthology.org/iccv/2013/xu2013iccv-efficient/}
}