Recognition from Hand Cameras: A Revisit with Deep Learning

Chan, Cheng-Sheng; Chen, Shou-Zhong; Xie, Pei-Xuan; Chang, Chiung-Chih; Sun, Min

doi:10.1007/978-3-319-46493-0_31

ECCV 2016 pp. 505-521

Abstract

We revisit the study of a wrist-mounted camera system (referred to as HandCam) for recognizing activities of hands. HandCam has two unique properties as compared to egocentric systems (referred to as HeadCam): (1) it avoids the need to detect hands; (2) it more consistently observes the activities of hands. By taking advantage of these properties, we propose a deep-learning-based method to recognize hand states (free vs. active hands, hand gestures, object categories), and discover object categories. Moreover, we propose a novel two-streams deep network to further take advantage of both HandCam and HeadCam. We have collected a new synchronized HandCam and HeadCam dataset with 20 videos captured in three scenes for hand states recognition. Experiments show that our HandCam system consistently outperforms a deep-learning-based HeadCam method (with estimated manipulation regions) and a dense-trajectory-based HeadCam method in all tasks. We also show that HandCam videos captured by different users can be easily aligned to improve free vs. active recognition accuracy (3.3% improvement) in the across-scenes use case. Moreover, we observe that fine-tuning the Convolutional Neural Network consistently improves accuracy. Finally, our novel two-streams deep network combining HandCam and HeadCam achieves the best performance in four out of five tasks. With more data, we believe a joint HandCam and HeadCam system can robustly log hand states in daily life.

Cite

Text

Chan et al. "Recognition from Hand Cameras: A Revisit with Deep Learning." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46493-0_31

Markdown

[Chan et al. "Recognition from Hand Cameras: A Revisit with Deep Learning." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/chan2016eccv-recognition/) doi:10.1007/978-3-319-46493-0_31

BibTeX

@inproceedings{chan2016eccv-recognition,
  title     = {{Recognition from Hand Cameras: A Revisit with Deep Learning}},
  author    = {Chan, Cheng-Sheng and Chen, Shou-Zhong and Xie, Pei-Xuan and Chang, Chiung-Chih and Sun, Min},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {505--521},
  doi       = {10.1007/978-3-319-46493-0_31},
  url       = {https://mlanthology.org/eccv/2016/chan2016eccv-recognition/}
}