Temporal Classification of Natural Gesture and Application to Video Coding
Abstract
A method for the temporal classification of natural gesture from video imagery is presented. The work is motivated by recent developments in the theory of natural gesture, which have identified several key temporal aspects of gesture important to communication. In particular, gesticulation during conversation can be coarsely characterized as periods of bi-phasic or tri-phasic gesture separated by a rest state. We first present an automatic procedure for hypothesizing plausible rest state configurations of a speaker. Second, we develop a state-based parsing algorithm used both to select among candidate rest states and to parse an incoming video stream into bi-phasic and tri-phasic gestures. Finally, we demonstrate the use of the bi-phasic/tri-phasic labeling to select semantically significant static images for low-bandwidth coding of video of story-telling speakers.
Cite
Text
Wilson et al. "Temporal Classification of Natural Gesture and Application to Video Coding." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1997. doi:10.1109/CVPR.1997.609442
Markdown
[Wilson et al. "Temporal Classification of Natural Gesture and Application to Video Coding." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1997.](https://mlanthology.org/cvpr/1997/wilson1997cvpr-temporal/) doi:10.1109/CVPR.1997.609442
BibTeX
@inproceedings{wilson1997cvpr-temporal,
title = {{Temporal Classification of Natural Gesture and Application to Video Coding}},
author = {Wilson, Andrew D. and Bobick, Aaron F. and Cassell, Justine},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {1997},
pages = {948--954},
doi = {10.1109/CVPR.1997.609442},
url = {https://mlanthology.org/cvpr/1997/wilson1997cvpr-temporal/}
}