Efficient Action Spotting Based on a Spacetime Oriented Structure Representation

Abstract

This paper addresses action spotting, the spatiotemporal detection and localization of human actions in video. A novel compact local descriptor of video dynamics in the context of action spotting is introduced based on visual spacetime oriented energy measurements. This descriptor is efficiently computed directly from raw image intensity data and thereby forgoes the problems typically associated with flow-based features. An important aspect of the descriptor is that it allows for the comparison of the underlying dynamics of two spacetime video segments irrespective of spatial appearance, such as differences induced by clothing, and with robustness to clutter. An associated similarity measure is introduced that admits efficient exhaustive search for an action template across candidate video sequences. Empirical evaluation of the approach on a set of challenging natural videos suggests its efficacy.

Cite

Text

Derpanis et al. "Efficient Action Spotting Based on a Spacetime Oriented Structure Representation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2010. doi:10.1109/CVPR.2010.5539874

Markdown

[Derpanis et al. "Efficient Action Spotting Based on a Spacetime Oriented Structure Representation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2010.](https://mlanthology.org/cvpr/2010/derpanis2010cvpr-efficient/) doi:10.1109/CVPR.2010.5539874

BibTeX

@inproceedings{derpanis2010cvpr-efficient,
  title     = {{Efficient Action Spotting Based on a Spacetime Oriented Structure Representation}},
  author    = {Derpanis, Konstantinos G. and Sizintsev, Mikhail and Cannons, Kevin J. and Wildes, Richard P.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2010},
  pages     = {1990--1997},
  doi       = {10.1109/CVPR.2010.5539874},
  url       = {https://mlanthology.org/cvpr/2010/derpanis2010cvpr-efficient/}
}