Spatio-Temporal Covariance Descriptors for Action and Gesture Recognition

Abstract

We propose a new action and gesture recognition method based on spatio-temporal covariance descriptors and a weighted Riemannian locality preserving projection approach that takes into account the curved space formed by the descriptors. The weighted projection is then exploited during boosting to create a final multiclass classification algorithm that employs the most useful spatio-temporal regions. We also show how the descriptors can be computed quickly through the use of integral video representations. Experiments on the UCF Sports, CK+ facial expression and Cambridge hand gesture datasets indicate superior performance of the proposed method compared to several recent state-of-the-art techniques. The proposed method is robust and does not require additional processing of the videos, such as foreground detection, interest-point detection or tracking.
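To make the abstract's computational ideas concrete, below is a minimal sketch (not the authors' implementation) of a spatio-temporal covariance descriptor over a video cuboid, computed in constant time per region from integral-video cumulative sums, together with a log-Euclidean distance that respects the curved geometry of symmetric positive definite matrices. The feature set, function names and the 1e-6 ridge are illustrative assumptions; the weighted Riemannian projection and boosting stages of the paper are not covered here.

import numpy as np

def pixel_features(video):
    # Per-pixel features for a grayscale video of shape (T, H, W):
    # intensity plus absolute temporal and spatial gradients (d = 4 here).
    v = video.astype(np.float64)
    gt, gy, gx = np.gradient(v)
    return np.stack([v, np.abs(gx), np.abs(gy), np.abs(gt)], axis=-1)  # (T, H, W, d)

def integral_video(feats):
    # Cumulative sums over (t, y, x) of the features and of their pairwise
    # products, zero-padded so cuboid sums reduce to inclusion-exclusion.
    prods = np.einsum('thwi,thwj->thwij', feats, feats)
    S = feats.cumsum(0).cumsum(1).cumsum(2)
    Q = prods.cumsum(0).cumsum(1).cumsum(2)
    S = np.pad(S, [(1, 0), (1, 0), (1, 0), (0, 0)])
    Q = np.pad(Q, [(1, 0), (1, 0), (1, 0), (0, 0), (0, 0)])
    return S, Q

def cuboid_sum(A, t0, t1, y0, y1, x0, x1):
    # Sum of A over [t0,t1) x [y0,y1) x [x0,x1) in O(1) via inclusion-exclusion.
    total = 0
    for use_t0, use_y0, use_x0 in np.ndindex(2, 2, 2):
        sign = (-1) ** (use_t0 + use_y0 + use_x0)
        total = total + sign * A[t0 if use_t0 else t1,
                                 y0 if use_y0 else y1,
                                 x0 if use_x0 else x1]
    return total

def covariance_descriptor(S, Q, t0, t1, y0, y1, x0, x1):
    # d x d covariance of the per-pixel features inside the cuboid.
    n = (t1 - t0) * (y1 - y0) * (x1 - x0)
    s = cuboid_sum(S, t0, t1, y0, y1, x0, x1)
    q = cuboid_sum(Q, t0, t1, y0, y1, x0, x1)
    C = (q - np.outer(s, s) / n) / (n - 1)
    return C + 1e-6 * np.eye(C.shape[0])  # small ridge keeps C positive definite

def log_euclidean_distance(C1, C2):
    # Distance on the manifold of SPD matrices via the matrix logarithm.
    def spd_log(C):
        w, V = np.linalg.eigh(C)
        return (V * np.log(w)) @ V.T
    return np.linalg.norm(spd_log(C1) - spd_log(C2), 'fro')

# Example: descriptors for two cuboids of a small random video.
video = np.random.rand(16, 32, 32)
S, Q = integral_video(pixel_features(video))
C_a = covariance_descriptor(S, Q, 0, 8, 0, 16, 0, 16)
C_b = covariance_descriptor(S, Q, 8, 16, 16, 32, 16, 32)
print(log_euclidean_distance(C_a, C_b))

Because the integral-video tables are built once per video, the covariance of any axis-aligned spatio-temporal region can then be read off with a fixed number of lookups, which is what makes dense region sampling for boosting tractable.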

Cite

Text

Sanin et al. "Spatio-Temporal Covariance Descriptors for Action and Gesture Recognition." IEEE/CVF Winter Conference on Applications of Computer Vision, 2013. doi:10.1109/WACV.2013.6475006

Markdown

[Sanin et al. "Spatio-Temporal Covariance Descriptors for Action and Gesture Recognition." IEEE/CVF Winter Conference on Applications of Computer Vision, 2013.](https://mlanthology.org/wacv/2013/sanin2013wacv-spatio/) doi:10.1109/WACV.2013.6475006

BibTeX

@inproceedings{sanin2013wacv-spatio,
  title     = {{Spatio-Temporal Covariance Descriptors for Action and Gesture Recognition}},
  author    = {Sanin, Andres and Sanderson, Conrad and Harandi, Mehrtash Tafazzoli and Lovell, Brian C.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2013},
  pages     = {103--110},
  doi       = {10.1109/WACV.2013.6475006},
  url       = {https://mlanthology.org/wacv/2013/sanin2013wacv-spatio/}
}