A Two-Stage Detector for Hand Detection in Ego-Centric Videos

Abstract

We propose a two-stage detector that not only detects and localizes hands but also efficiently provides fine-grained information within each hand bounding box. In the first stage, hand bounding box proposals are generated from a pixel-level hand probability map. In the second stage, each hand proposal is evaluated by a Multi-task Convolutional Neural Network (CNN) that filters out false positives and recovers fine shape and landmark information. Through experiments, we demonstrate that our method detects hands, together with their shape and landmark information, efficiently and robustly, and that our system can be flexibly combined with other detection methods to handle a new scene. A further experiment shows that our Multi-task CNN can also be extended to hand gesture classification with a large performance increase.
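
The two-stage flow described in the abstract can be illustrated with a minimal sketch. The abstract does not say how proposals are derived from the probability map, so the thresholding and 4-connected component grouping below are assumptions, not the authors' procedure. The names `propose_boxes`, `filter_proposals`, and `score_fn` are hypothetical; `score_fn` merely stands in for the second-stage Multi-task CNN, which is not implemented here.

```python
from collections import deque


def propose_boxes(prob_map, threshold=0.5, min_area=2):
    """Stage 1 (assumed): group pixels with probability >= threshold into
    4-connected regions and return one inclusive (top, left, bottom, right)
    box per region that covers at least min_area pixels."""
    rows = len(prob_map)
    cols = len(prob_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or prob_map[r][c] < threshold:
                continue
            # Breadth-first search over the region containing (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and prob_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


def filter_proposals(boxes, score_fn, min_score=0.5):
    """Stage 2 (placeholder): keep boxes that a scoring function accepts.
    score_fn is a hypothetical stand-in for the Multi-task CNN."""
    return [box for box in boxes if score_fn(box) >= min_score]
```

In this sketch, `propose_boxes` would be called on a per-pixel hand probability map given as a list of rows, and its output passed to `filter_proposals` together with whatever classifier plays the second-stage role. Shape and landmark estimation, which the paper's second stage also performs, is outside the scope of the sketch.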

Cite

Text

Zhu et al. "A Two-Stage Detector for Hand Detection in Ego-Centric Videos." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477665

Markdown

[Zhu et al. "A Two-Stage Detector for Hand Detection in Ego-Centric Videos." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/zhu2016wacv-two/) doi:10.1109/WACV.2016.7477665

BibTeX

@inproceedings{zhu2016wacv-two,
  title     = {{A Two-Stage Detector for Hand Detection in Ego-Centric Videos}},
  author    = {Zhu, Xiaolong and Liu, Wei and Jia, Xuhui and Wong, Kwan-Yee K.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2016},
  pages     = {1--8},
  doi       = {10.1109/WACV.2016.7477665},
  url       = {https://mlanthology.org/wacv/2016/zhu2016wacv-two/}
}