Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition

Abstract

The large majority of deep learning-based sign language recognition systems adopt a word-model approach. Here we present a system that works with subunits rather than word models. We propose a pipelined approach to deep learning that uses a factorisation algorithm to derive hand motion features, embedded within a low-rank trajectory space. Recurrent neural networks are then trained on these embedded features for subunit recognition, followed by a second-stage neural network for sign recognition. Our evaluation shows that the proposed solution compares well in accuracy with the state of the art, while providing the added benefits of better interpretability and phonologically-meaningful subunits that can operate across different signers and sign languages.
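The pipeline summarised in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch only: the truncated SVD standing in for the factorisation step, the Elman-style recurrence standing in for the recurrent networks, and all dimensions and weights are assumptions, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hand-trajectory data: T frames x D coordinates
# (e.g. tracked hand keypoint positions over time).
T, D, rank = 50, 8, 3
trajectories = rng.standard_normal((T, D))

# Stage 1 (assumed): low-rank embedding via truncated SVD, standing in
# for the paper's factorisation algorithm over the trajectory space.
U, s, Vt = np.linalg.svd(trajectories, full_matrices=False)
embedded = U[:, :rank] * s[:rank]  # T x rank trajectory-space features

# Stage 2 (assumed): a minimal Elman-style recurrence over the embedded
# features, producing per-frame subunit scores. Sizes and the untrained
# random weights are purely illustrative.
H, n_subunits = 16, 5
Wx = rng.standard_normal((rank, H)) * 0.1
Wh = rng.standard_normal((H, H)) * 0.1
Wo = rng.standard_normal((H, n_subunits)) * 0.1

h = np.zeros(H)
subunit_scores = []
for x_t in embedded:
    h = np.tanh(x_t @ Wx + h @ Wh)
    subunit_scores.append(h @ Wo)
subunit_scores = np.array(subunit_scores)  # T x n_subunits

# A second-stage network would then map the subunit score sequence to a
# sign label; that stage is omitted here.
print(embedded.shape, subunit_scores.shape)
```

The interpretability benefit claimed in the abstract comes from this decomposition: the intermediate subunit scores are inspectable quantities, unlike the hidden activations of an end-to-end word-model network.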

Cite

Text

Borg and Camilleri. "Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-66096-3_15

Markdown

[Borg and Camilleri. "Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/borg2020eccvw-phonologicallymeaningful/) doi:10.1007/978-3-030-66096-3_15

BibTeX

@inproceedings{borg2020eccvw-phonologicallymeaningful,
  title     = {{Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition}},
  author    = {Borg, Mark and Camilleri, Kenneth P.},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2020},
  pages     = {199--217},
  doi       = {10.1007/978-3-030-66096-3_15},
  url       = {https://mlanthology.org/eccvw/2020/borg2020eccvw-phonologicallymeaningful/}
}