Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition
Abstract
The large majority of deep learning-based sign language recognition systems adopt a word-model approach. Here we present a system that works with subunits rather than word models. We propose a pipelined approach to deep learning that uses a factorisation algorithm to derive hand motion features embedded within a low-rank trajectory space. Recurrent neural networks are then trained on these embedded features for subunit recognition, followed by a second-stage neural network for sign recognition. Our evaluation shows that the proposed solution compares well in accuracy against the state of the art, while providing the added benefits of better interpretability and phonologically-meaningful subunits that can operate across different signers and sign languages.
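The low-rank trajectory embedding described above can be sketched with a truncated SVD: hand motion trajectories are projected onto a small number of basis directions that capture most of the motion variance. This is an illustrative sketch only; the function and variable names (`embed_trajectories`, `W`, `rank`) are assumptions for the example, not the authors' factorisation algorithm or implementation.

```python
import numpy as np

def embed_trajectories(W, rank=3):
    """Project trajectories onto a low-rank trajectory space via truncated SVD.

    W    : (T, D) matrix of T time steps x D motion coordinates (illustrative layout).
    rank : dimensionality of the low-rank trajectory space.
    Returns a (T, rank) matrix of embedded motion features.
    """
    # Centre the trajectories, then keep only the leading singular directions.
    U, s, Vt = np.linalg.svd(W - W.mean(axis=0), full_matrices=False)
    return U[:, :rank] * s[:rank]

# Toy example: noisy hand motion that lies close to a rank-2 subspace.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
basis = np.stack([t, np.sin(2 * np.pi * t)], axis=1)          # (50, 2) motion basis
W = basis @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(50, 6))
E = embed_trajectories(W, rank=2)
print(E.shape)  # (50, 2)
```

In the paper's pipeline, features of this kind would then be fed to recurrent networks for subunit recognition; the embedding step itself is what makes the learned subunits compact and interpretable.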
Cite
Text
Borg and Camilleri. "Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-66096-3_15
Markdown
[Borg and Camilleri. "Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/borg2020eccvw-phonologicallymeaningful/) doi:10.1007/978-3-030-66096-3_15
BibTeX
@inproceedings{borg2020eccvw-phonologicallymeaningful,
title = {{Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition}},
author = {Borg, Mark and Camilleri, Kenneth P.},
booktitle = {European Conference on Computer Vision Workshops},
year = {2020},
pages = {199-217},
doi = {10.1007/978-3-030-66096-3_15},
url = {https://mlanthology.org/eccvw/2020/borg2020eccvw-phonologicallymeaningful/}
}