Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets

Abstract

3D hand pose inference from monocular RGB data is a challenging problem. CNN-based approaches have shown great promise in tackling it. However, such approaches are data-hungry, and real labeled hand training data is very hard to obtain. To overcome this, we propose a new, large, realistically rendered hand dataset and a neural network trained on it that can refine itself without supervision on real unlabeled RGB images, given corresponding depth images. We benchmark and validate our method on existing and captured datasets, demonstrating that it compares favorably to or outperforms state-of-the-art methods on tasks ranging from 3D pose estimation to hand gesture recognition.

Cite

Text

Dibra et al. "Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018. doi:10.1109/CVPRW.2018.00155

Markdown

[Dibra et al. "Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018.](https://mlanthology.org/cvprw/2018/dibra2018cvprw-monocular/) doi:10.1109/CVPRW.2018.00155

BibTeX

@inproceedings{dibra2018cvprw-monocular,
  title     = {{Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets}},
  author    = {Dibra, Endri and Melchior, Silvan and Balkis, Ali and Wolf, Thomas and Öztireli, Cengiz and Gross, Markus H.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2018},
  pages     = {1075--1085},
  doi       = {10.1109/CVPRW.2018.00155},
  url       = {https://mlanthology.org/cvprw/2018/dibra2018cvprw-monocular/}
}