Macro-Cuboïd Based Probabilistic Matching for Lip-Reading Digits
Abstract
In this paper, we present a spatio-temporal feature representation and a probabilistic matching function to recognise lip movements from pronounced digits. Our model (1) automatically selects spatio-temporal features extracted from 10 digit model templates and (2) matches them with probe video sequences. Spatio-temporal features embed lip movements from pronouncing digits and contain more discriminative information than spatial features alone. A model template for each digit is represented by a set of spatio-temporal features at multiple scales. A probabilistic sequence matching function automatically segments a probe video sequence and matches the most likely sequence of digits recognised in the probe sequence. We demonstrate the proposed approach using the CUAVE database and compare our representational scheme with three alternative methods, based on optical flow, intensity gradient and block matching, respectively. The evaluation shows that the proposed approach outperforms the others in recognition accuracy and is robust in coping with variations in probe sequences.
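The abstract does not specify how the probabilistic sequence matching function is computed. As a rough illustration of jointly segmenting a probe sequence and labelling its segments, the sketch below uses a generic Viterbi-style dynamic programme with a fixed switch penalty. This is an assumed stand-in, not the authors' method: the function name `segment_digits`, the `frame_scores` input (per-frame log-likelihoods under each digit template) and the `switch_penalty` parameter are all hypothetical.

```python
def segment_digits(frame_scores, switch_penalty=1.0):
    """Segment a probe sequence into labelled runs (illustrative sketch only).

    frame_scores[t][d] is an assumed log-likelihood of frame t under digit
    template d. Returns a list of (digit, start, end) with end exclusive,
    maximising the summed scores minus a penalty for each label change.
    """
    if not frame_scores:
        return []
    n_labels = len(frame_scores[0])
    best = list(frame_scores[0])  # best[d]: best score of a path ending in label d
    back = []                     # back[t-1][d]: label at frame t-1 on that path
    for scores in frame_scores[1:]:
        # The best-scoring previous label is the best source for any switch.
        top = max(range(n_labels), key=lambda d: best[d])
        new_best, pointers = [], []
        for d in range(n_labels):
            stay = best[d]
            switch = best[top] - switch_penalty
            if stay >= switch:
                new_best.append(stay + scores[d])
                pointers.append(d)
            else:
                new_best.append(switch + scores[d])
                pointers.append(top)
        best = new_best
        back.append(pointers)
    # Backtrack from the best final label to recover the per-frame labels.
    label = max(range(n_labels), key=lambda d: best[d])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    path.reverse()
    # Collapse runs of identical labels into segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((path[start], start, t))
            start = t
    return segments
```

Because adjacent frames with the same label are merged, this simplified version cannot separate a digit repeated back to back; a fuller model would need explicit segment boundaries or duration constraints.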
Cite
Text
Pachoud et al. "Macro-Cuboïd Based Probabilistic Matching for Lip-Reading Digits." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2008. doi:10.1109/CVPR.2008.4587734
Markdown
[Pachoud et al. "Macro-Cuboïd Based Probabilistic Matching for Lip-Reading Digits." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2008.](https://mlanthology.org/cvpr/2008/pachoud2008cvpr-macro/) doi:10.1109/CVPR.2008.4587734
BibTeX
@inproceedings{pachoud2008cvpr-macro,
title = {{Macro-Cuboïd Based Probabilistic Matching for Lip-Reading Digits}},
author = {Pachoud, Samuel and Gong, Shaogang and Cavallaro, Andrea},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2008},
doi = {10.1109/CVPR.2008.4587734},
url = {https://mlanthology.org/cvpr/2008/pachoud2008cvpr-macro/}
}