SCAT: Stride Consistency with Auto-Regressive Regressor and Transformer for Hand Pose Estimation

Abstract

Current state-of-the-art monocular 3D hand pose estimation methods are mostly model-based. For instance, MANO is one of the most popular parametric hand models; it can depict hand shapes and poses and is widely adopted for estimating hand poses in images and videos. However, MANO is derived from scanned hand data with limited shapes and poses, which constrains its ability to depict in-the-wild shape and pose variations. In this paper, we propose a 3D hand pose estimation approach that does not depend on any parametric hand model yet can still accurately estimate in-the-wild hand poses. Our approach (Stride Consistency with Auto-Regressive Regressor and Transformer, SCAT) offers a new representation for measuring hand poses. The representation consists of a mean-shape hand template and 21 hand joint offsets, which depict the 3D distances between the hand template and the hand to be estimated. In addition, SCAT can generate a robust and smooth linear mapping between visual feature maps and the target 3D offsets, ensuring inter-frame smoothness and removing motion jitter. We also introduce an auto-regressive refinement procedure that iteratively refines the hand pose estimate. Extensive experiments show that SCAT generates more accurate and smoother 3D hand pose estimates than state-of-the-art methods.
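The template-plus-offsets representation and the iterative refinement described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`apply_offsets`, `refine`, `toy_regressor`), the zero initialization, the additive update rule, and the number of iterations are all assumptions made for illustration, and the toy regressor stands in for the learned network that would map visual features to offset corrections.

```python
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]
NUM_JOINTS = 21  # joint count stated in the abstract


def apply_offsets(template: List[Vec3], offsets: List[Vec3]) -> List[Vec3]:
    """Pose = mean-shape template joints plus per-joint 3D offsets."""
    if len(template) != NUM_JOINTS or len(offsets) != NUM_JOINTS:
        raise ValueError("expected 21 joints in both template and offsets")
    return [
        (t[0] + o[0], t[1] + o[1], t[2] + o[2])
        for t, o in zip(template, offsets)
    ]


def refine(
    template: List[Vec3],
    features: object,
    regressor: Callable[[object, List[Vec3]], List[Vec3]],
    num_iters: int = 3,  # assumed value, not taken from the paper
) -> List[Vec3]:
    """Iteratively refine offsets: each step feeds the current estimate
    back into the regressor, which returns a correction (assumed additive)."""
    offsets: List[Vec3] = [(0.0, 0.0, 0.0)] * NUM_JOINTS  # assumed start
    for _ in range(num_iters):
        delta = regressor(features, offsets)
        offsets = [
            (o[0] + d[0], o[1] + d[1], o[2] + d[2])
            for o, d in zip(offsets, delta)
        ]
    return apply_offsets(template, offsets)


def toy_regressor(features: object, offsets: List[Vec3]) -> List[Vec3]:
    """Hypothetical stand-in for a learned regressor: here `features` is
    simply the target offsets, and each call closes half the remaining gap."""
    target: List[Vec3] = features  # type: ignore[assignment]
    return [
        (0.5 * (t[0] - o[0]), 0.5 * (t[1] - o[1]), 0.5 * (t[2] - o[2]))
        for t, o in zip(target, offsets)
    ]


if __name__ == "__main__":
    template = [(0.0, 0.0, 0.0)] * NUM_JOINTS
    target_offsets = [(1.0, 2.0, 3.0)] * NUM_JOINTS
    pose = refine(template, target_offsets, toy_regressor, num_iters=3)
    print(pose[0])
```

With the toy regressor, each pass halves the remaining error, so the estimate approaches the target offsets as the number of iterations grows. A real system would replace `toy_regressor` with a network conditioned on image features.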

Cite

Text

Gao et al. "SCAT: Stride Consistency with Auto-Regressive Regressor and Transformer for Hand Pose Estimation." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00256

Markdown

[Gao et al. "SCAT: Stride Consistency with Auto-Regressive Regressor and Transformer for Hand Pose Estimation." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/gao2021iccvw-scat/) doi:10.1109/ICCVW54120.2021.00256

BibTeX

@inproceedings{gao2021iccvw-scat,
  title     = {{SCAT: Stride Consistency with Auto-Regressive Regressor and Transformer for Hand Pose Estimation}},
  author    = {Gao, Daiheng and Zhang, Bang and Wang, Qi and Zhang, Xindi and Pan, Pan and Xu, Yinghui},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2021},
  pages     = {2266-2275},
  doi       = {10.1109/ICCVW54120.2021.00256},
  url       = {https://mlanthology.org/iccvw/2021/gao2021iccvw-scat/}
}