Contextual Attention for Hand Detection in the Wild

Abstract

We present Hand-CNN, a novel convolutional network architecture for detecting hand masks and predicting hand orientations in unconstrained images. Hand-CNN extends MaskRCNN with a novel attention mechanism to incorporate contextual cues in the detection process. This attention mechanism can be implemented as an efficient network module that captures non-local dependencies between features. This network module can be inserted at different stages of an object detection network, and the entire detector can be trained end-to-end. We also introduce large-scale annotated hand datasets containing hands in unconstrained images for training and evaluation. We show that Hand-CNN outperforms existing methods on the newly collected datasets and the publicly available PASCAL VOC human layout dataset. Data and code: https://www3.cs.stonybrook.edu/~cvl/projects/hand_det_attention/
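
To make the idea of a module that "captures non-local dependencies between features" concrete, the following is a minimal NumPy sketch of a generic non-local (self-attention) block over a feature map. It is not the Hand-CNN contextual attention module, whose details the abstract does not give. The function name `non_local_block`, the projection matrices, the scaled dot-product affinity, and the tensor shapes are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_block(features, w_query, w_key, w_value, w_out):
    """Generic non-local block (illustrative, not the paper's module).

    features: array of shape (C, H, W), a single feature map.
    w_query, w_key, w_value: projection matrices of shape (C, D).
    w_out: projection of shape (D, C) back to the input channels.
    Returns the refined feature map (C, H, W) and the (N, N)
    attention matrix, where N = H * W spatial positions.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w).T              # (N, C): one row per position
    q = x @ w_query                               # (N, D)
    k = x @ w_key                                 # (N, D)
    v = x @ w_value                               # (N, D)
    # Every position attends to every other position, so the affinity
    # matrix relates features regardless of spatial distance.
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)   # (N, N)
    context = attn @ v                            # (N, D)
    out = x + context @ w_out                     # residual connection, (N, C)
    return out.T.reshape(c, h, w), attn


# Hypothetical sizes for the demo: 8 channels, a 4x5 map, 4-dim embedding.
rng = np.random.default_rng(0)
C, H, W, D = 8, 4, 5, 4
features = rng.standard_normal((C, H, W))
w_query = rng.standard_normal((C, D)) * 0.1
w_key = rng.standard_normal((C, D)) * 0.1
w_value = rng.standard_normal((C, D)) * 0.1
w_out = np.zeros((D, C))  # zero init: the block starts as an identity mapping

out, attn = non_local_block(features, w_query, w_key, w_value, w_out)
```

Because the block keeps the input shape and adds its output residually, a module of this kind can in principle be dropped between existing layers of a detector and trained with the rest of the network, which is the property the abstract describes. The zero-initialized output projection in the demo is one common way to insert such a block without disturbing a pretrained backbone at the start of training.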

Cite

Text

Narasimhaswamy et al. "Contextual Attention for Hand Detection in the Wild." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019. doi:10.1109/ICCV.2019.00966

Markdown

[Narasimhaswamy et al. "Contextual Attention for Hand Detection in the Wild." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.](https://mlanthology.org/iccv/2019/narasimhaswamy2019iccv-contextual/) doi:10.1109/ICCV.2019.00966

BibTeX

@inproceedings{narasimhaswamy2019iccv-contextual,
  title     = {{Contextual Attention for Hand Detection in the Wild}},
  author    = {Narasimhaswamy, Supreeth and Wei, Zhengwei and Wang, Yang and Zhang, Justin and Hoai, Minh},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2019},
  doi       = {10.1109/ICCV.2019.00966},
  url       = {https://mlanthology.org/iccv/2019/narasimhaswamy2019iccv-contextual/}
}