Cross Transferring Activity Recognition to Word Level Sign Language Detection
Abstract
The lack of large-scale labelled datasets in word-level sign language recognition (WSLR) poses a challenge to detecting sign language from videos. Most WSLR approaches operate on datasets that do not model real-world settings well, as they lack variability in signers, background, lighting, and inter-signer variation. We chose the MS-ASL dataset to overcome these limitations, as it models open-world settings well. This paper benchmarks successful action recognition architectures on the MS-ASL dataset using transfer learning. We achieve a new state-of-the-art accuracy of 92.35%, an improvement of 7.03% over the previous state of the art introduced by the MS-ASL paper. We analyze how action recognition architectures fare on the task of WSLR, and we propose SlowFast 8×8 ResNet-101 as a robust and suitable architecture for WSLR.
Cite
Text
Radhakrishnan et al. "Cross Transferring Activity Recognition to Word Level Sign Language Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00273
Markdown
[Radhakrishnan et al. "Cross Transferring Activity Recognition to Word Level Sign Language Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/radhakrishnan2022cvprw-cross/) doi:10.1109/CVPRW56347.2022.00273
BibTeX
@inproceedings{radhakrishnan2022cvprw-cross,
title = {{Cross Transferring Activity Recognition to Word Level Sign Language Detection}},
author = {Radhakrishnan, Srijith and Mohan, Nikhil C. and Varma, Manisimha and Varma, Jaithra and Pai, Smitha N.},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2022},
pages = {2445--2452},
doi = {10.1109/CVPRW56347.2022.00273},
url = {https://mlanthology.org/cvprw/2022/radhakrishnan2022cvprw-cross/}
}