Action Recognition in Spatiotemporal Volume

Abstract

We recognize actions and activities in video sequences as distinctive patterns in the 3D spatiotemporal volume of motion energy. To represent an action or activity, local motion descriptors are computed in the 3D volume at points of salient motion; each descriptor captures highly discriminative, invariant motion characteristics within a spherical neighborhood. Two actions are then matched based on the similarity between their motion descriptors. Using these new motion descriptors, our action recognition algorithm achieves an accuracy of 98.6% on the Weizmann action dataset.
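The pipeline outlined in the abstract (motion-energy volume, salient points, spherical local descriptors, descriptor matching) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how motion energy, saliency, the descriptor, or the similarity measure are defined, so everything below is an assumption made for illustration. Motion energy is approximated by absolute frame differences, salient points are the highest-energy voxels, the descriptor is the mean energy in concentric spherical shells, and matching uses averaged best cosine similarity. All function names and parameters (`motion_energy`, `salient_points`, `spherical_descriptor`, `describe`, `similarity`, `moving_square`, `k`, `radius`, `n_shells`) are hypothetical.

```python
import numpy as np

def motion_energy(frames):
    """Assumption: absolute frame difference stands in for motion energy."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(np.diff(frames, axis=0))  # shape (T-1, H, W)

def salient_points(energy, k=20, radius=3):
    """Assumption: the k highest-energy voxels, kept `radius` away from borders."""
    r = radius
    inner = energy[r:-r, r:-r, r:-r]
    flat = np.argsort(inner, axis=None)[::-1][:k]
    return np.stack(np.unravel_index(flat, inner.shape), axis=1) + r

def spherical_descriptor(energy, point, radius=3, n_shells=3):
    """Mean energy in concentric spherical shells around `point`, L2-normalised.

    Shell averages ignore direction, so they do not change when the
    neighborhood is rotated (an illustrative choice, not the paper's descriptor).
    """
    t, y, x = point
    r = radius
    patch = energy[t - r:t + r + 1, y - r:y + r + 1, x - r:x + r + 1]
    g = np.arange(-r, r + 1)
    dt, dy, dx = np.meshgrid(g, g, g, indexing="ij")
    dist = np.sqrt(dt**2 + dy**2 + dx**2)
    edges = np.linspace(0, r, n_shells + 1)
    desc = np.zeros(n_shells)
    for i in range(n_shells):
        upper = dist <= edges[i + 1] if i == n_shells - 1 else dist < edges[i + 1]
        mask = (dist >= edges[i]) & upper
        if mask.any():
            desc[i] = patch[mask].mean()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

def describe(frames, k=20, radius=3):
    """Represent a clip by the descriptors at its salient points."""
    energy = motion_energy(frames)
    points = salient_points(energy, k, radius)
    return np.array([spherical_descriptor(energy, p, radius) for p in points])

def similarity(desc_a, desc_b):
    """Symmetric score: average best cosine match in each direction."""
    sims = desc_a @ desc_b.T
    return 0.5 * (sims.max(axis=1).mean() + sims.max(axis=0).mean())

def moving_square(step=1, n_frames=16, height=32, width=48):
    """Synthetic clip: an 8x8 square sliding right by `step` pixels per frame."""
    frames = np.zeros((n_frames, height, width))
    for t in range(n_frames):
        x0 = 4 + step * t
        frames[t, 12:20, x0:x0 + 8] = 1.0
    return frames

if __name__ == "__main__":
    slow = describe(moving_square(step=1))
    fast = describe(moving_square(step=2))
    print("slow vs slow:", similarity(slow, slow))
    print("slow vs fast:", similarity(slow, fast))
```

In a recognition setting, a query clip would be assigned the label of the training clip with the highest `similarity` score; that nearest-neighbor rule is likewise an assumption, since the abstract states only that actions are matched by descriptor similarity.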

Cite

Text

Zhong and Stevens. "Action Recognition in Spatiotemporal Volume." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2010. doi:10.1109/CVPRW.2010.5543836

Markdown

[Zhong and Stevens. "Action Recognition in Spatiotemporal Volume." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2010.](https://mlanthology.org/cvprw/2010/zhong2010cvprw-action/) doi:10.1109/CVPRW.2010.5543836

BibTeX

@inproceedings{zhong2010cvprw-action,
  title     = {{Action Recognition in Spatiotemporal Volume}},
  author    = {Zhong, Yu and Stevens, Mark},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2010},
  pages     = {25--30},
  doi       = {10.1109/CVPRW.2010.5543836},
  url       = {https://mlanthology.org/cvprw/2010/zhong2010cvprw-action/}
}