SemiHand: Semi-Supervised Hand Pose Estimation with Consistency

Abstract

We present SemiHand, a semi-supervised framework for 3D hand pose estimation from monocular images. We pre-train the model on labeled synthetic data and fine-tune it on unlabeled real-world data by pseudo-labeling with consistency training. By design, we introduce data augmentation of differing difficulty, a consistency regularizer, label correction, and sample selection for RGB-based 3D hand pose estimation. In particular, by approximating hand masks from hand poses, we propose a cross-modal consistency that leverages semantic predictions to guide the predicted poses. Meanwhile, we introduce pose registration as label correction to guarantee the biomechanical feasibility of hand bone lengths. Experiments show that our method achieves a favorable improvement on real-world datasets after fine-tuning.
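The abstract's key ingredients, pseudo-label consistency with sample selection and bone-length label correction, can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, the confidence threshold, and the per-bone rescaling heuristic are assumptions for illustration only:

```python
import numpy as np

def consistency_loss(pred_weak, pred_strong, conf, threshold=0.8):
    """Pseudo-label consistency sketch: predictions on weakly augmented
    inputs act as pseudo-labels for strongly augmented inputs, and only
    confident samples are kept (sample selection). Threshold is assumed."""
    mask = conf >= threshold
    if not mask.any():
        return 0.0
    diff = pred_weak[mask] - pred_strong[mask]
    # Mean squared joint error over the selected samples.
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

def correct_bone_lengths(joints, parents, ref_lengths):
    """Label-correction sketch: rescale each bone to a reference length
    so corrected pseudo-label poses stay biomechanically feasible.
    Assumes `parents` lists each joint's parent before its children."""
    corrected = joints.copy()
    for j, p in enumerate(parents):
        if p < 0:  # root joint has no parent bone
            continue
        bone = corrected[j] - corrected[p]
        norm = np.linalg.norm(bone)
        if norm > 1e-8:
            corrected[j] = corrected[p] + bone / norm * ref_lengths[j]
    return corrected
```

The actual method additionally uses a cross-modal consistency between predicted poses and approximated hand masks, which is omitted here.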

Cite

Text

Yang et al. "SemiHand: Semi-Supervised Hand Pose Estimation with Consistency." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.01117

Markdown

[Yang et al. "SemiHand: Semi-Supervised Hand Pose Estimation with Consistency." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/yang2021iccv-semihand/) doi:10.1109/ICCV48922.2021.01117

BibTeX

@inproceedings{yang2021iccv-semihand,
  title     = {{SemiHand: Semi-Supervised Hand Pose Estimation with Consistency}},
  author    = {Yang, Linlin and Chen, Shicheng and Yao, Angela},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {11364--11373},
  doi       = {10.1109/ICCV48922.2021.01117},
  url       = {https://mlanthology.org/iccv/2021/yang2021iccv-semihand/}
}