Continuous Hand Gesture Recognition for Human-Robot Collaborative Assembly

Abstract

In this work, we present a framework for dynamic hand gesture recognition from RGB images acquired by an overhead camera. The recognition supports Methods-Time Measurement-based planning of a human-robot collaborative workspace. The 3D hand posture is estimated by MediaPipe, and recognition is performed by a neural network that combines features layer-wise: features extracted by basic blocks of a Spatio-Temporal Adaptive Graph Convolutional Neural Network are fused with those extracted by basic spatio-temporal self-attention blocks. We recorded and manually annotated 12 videos comprising 54,659 RGB images of five basic motion sequences: grasp, move, position, release, and reach. We demonstrate experimentally that our network outperforms RNNs, ST-GCN, ST-AGCN, and CTR-GCN.
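To give a concrete sense of the graph-convolutional side of such a pipeline: MediaPipe Hands produces 21 landmarks per hand, and ST-GCN-style layers operate on a normalized adjacency matrix built from the hand-skeleton topology. The sketch below (not from the paper; an illustrative NumPy construction assuming MediaPipe's standard 21-landmark connection list) builds the symmetrically normalized adjacency `D^{-1/2}(A + I)D^{-1/2}` that a spatial graph convolution would multiply the landmark features by.

```python
import numpy as np

# MediaPipe Hands topology: 21 landmarks (0 = wrist, then 4 joints per finger),
# connected by the 21 edges of mediapipe's HAND_CONNECTIONS.
HAND_EDGES = [
    (0, 1), (1, 2), (2, 3), (3, 4),          # thumb
    (0, 5), (5, 6), (6, 7), (7, 8),          # index finger
    (5, 9), (9, 10), (10, 11), (11, 12),     # middle finger
    (9, 13), (13, 14), (14, 15), (15, 16),   # ring finger
    (13, 17), (0, 17), (17, 18), (18, 19), (19, 20),  # pinky + palm edge
]

def normalized_hand_adjacency(num_nodes: int = 21) -> np.ndarray:
    """Symmetrically normalized adjacency D^{-1/2} (A + I) D^{-1/2},
    the standard graph operator used by ST-GCN-style spatial layers."""
    a = np.zeros((num_nodes, num_nodes))
    for i, j in HAND_EDGES:
        a[i, j] = a[j, i] = 1.0           # undirected skeleton edges
    a += np.eye(num_nodes)                # self-loops (A + I)
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One spatial graph-convolution step on per-frame landmark features X
# (21 x C): X' = A_hat @ X @ W, here shown with random features/weights.
A_hat = normalized_hand_adjacency()
X = np.random.randn(21, 3)                # e.g. 3D landmark coordinates
W = np.random.randn(3, 16)                # learnable projection (illustrative)
X_out = A_hat @ X @ W                     # shape (21, 16)
```

In a full spatio-temporal network, this spatial step alternates with temporal convolutions (or, as in the paper's hybrid, with self-attention blocks) over the sequence of per-frame landmark graphs.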

Cite

Text

Kwolek. "Continuous Hand Gesture Recognition for Human-Robot Collaborative Assembly." IEEE/CVF International Conference on Computer Vision Workshops, 2023. doi:10.1109/ICCVW60793.2023.00214

Markdown

[Kwolek. "Continuous Hand Gesture Recognition for Human-Robot Collaborative Assembly." IEEE/CVF International Conference on Computer Vision Workshops, 2023.](https://mlanthology.org/iccvw/2023/kwolek2023iccvw-continuous/) doi:10.1109/ICCVW60793.2023.00214

BibTeX

@inproceedings{kwolek2023iccvw-continuous,
  title     = {{Continuous Hand Gesture Recognition for Human-Robot Collaborative Assembly}},
  author    = {Kwolek, Bogdan},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2023},
  pages     = {1992--1999},
  doi       = {10.1109/ICCVW60793.2023.00214},
  url       = {https://mlanthology.org/iccvw/2023/kwolek2023iccvw-continuous/}
}