View-Invariant Modeling and Recognition of Human Actions Using Grammars
Abstract
In this paper, we represent human actions as sentences generated by a language built on atomic body poses, or phonemes. Knowledge of body pose is stored only implicitly, as a set of silhouettes seen from multiple viewpoints; no explicit 3D poses or body models are used, and individual body parts are not identified. Actions and their constituent atomic poses are extracted from a set of multiview, multiperson video sequences by an automatic keyframe selection process, and are used to automatically construct a probabilistic context-free grammar (PCFG) that encodes the syntax of the actions. Given a new single-viewpoint video, we can parse it to simultaneously recognize actions and changes in viewpoint. Experimental results are provided.
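To make the grammar idea concrete, here is a minimal sketch, not the paper's method: the actual PCFG is learned automatically from multiview keyframes, whereas below a toy grammar with hypothetical pose phonemes (`p1`–`p4`) and hand-picked probabilities is written by hand, and recognition simply scores each action's expansions against an observed pose string.

```python
# Toy PCFG over hypothetical pose "phonemes" p1..p4.
# In the paper, both the atomic poses and the production probabilities
# are extracted automatically from multiview video; here they are
# invented for illustration only.

# Each action nonterminal expands into a pose sequence with a probability.
GRAMMAR = {
    "walk": [(0.7, ("p1", "p2", "p1", "p2")),
             (0.3, ("p1", "p2"))],
    "sit":  [(1.0, ("p1", "p3", "p4"))],
}

def recognize(observed):
    """Return the (action, probability) pair whose expansion matches the
    observed pose sequence, or (None, 0.0) if no expansion matches."""
    best = (None, 0.0)
    for action, expansions in GRAMMAR.items():
        for prob, poses in expansions:
            if poses == tuple(observed) and prob > best[1]:
                best = (action, prob)
    return best

print(recognize(["p1", "p2", "p1", "p2"]))  # -> ('walk', 0.7)
print(recognize(["p1", "p3", "p4"]))        # -> ('sit', 1.0)
```

A full implementation would instead run a probabilistic parser (e.g. Viterbi-style CYK) over the pose string, which also lets one action embed another and lets viewpoint-change productions fire mid-sentence, as the paper's grammar allows.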
Cite
Text
Ogale et al. "View-Invariant Modeling and Recognition of Human Actions Using Grammars." European Conference on Computer Vision, 2006. doi:10.1007/978-3-540-70932-9_9
Markdown
[Ogale et al. "View-Invariant Modeling and Recognition of Human Actions Using Grammars." European Conference on Computer Vision, 2006.](https://mlanthology.org/eccv/2006/ogale2006eccv-view/) doi:10.1007/978-3-540-70932-9_9
BibTeX
@inproceedings{ogale2006eccv-view,
title = {{View-Invariant Modeling and Recognition of Human Actions Using Grammars}},
author = {Ogale, Abhijit S. and Karapurkar, Alap and Aloimonos, Yiannis},
booktitle = {European Conference on Computer Vision},
year = {2006},
pages = {115-126},
doi = {10.1007/978-3-540-70932-9_9},
url = {https://mlanthology.org/eccv/2006/ogale2006eccv-view/}
}