Tracklet Descriptors for Action Modeling and Video Analysis

Abstract

We present spatio-temporal feature descriptors that can be inferred from video and used as building blocks in action recognition systems. They capture the evolution of “elementary action elements” under a set of assumptions on the image-formation model and are designed to be insensitive to nuisance variability (absolute position, contrast), while retaining discriminative statistics due to the fine-scale motion and the local shape in compact regions of the image. Despite their simplicity, these descriptors, used in conjunction with basic classifiers, attain state-of-the-art performance in the recognition of actions in benchmark datasets.

Cite

Text

Raptis and Soatto. "Tracklet Descriptors for Action Modeling and Video Analysis." European Conference on Computer Vision, 2010. doi:10.1007/978-3-642-15549-9_42

Markdown

[Raptis and Soatto. "Tracklet Descriptors for Action Modeling and Video Analysis." European Conference on Computer Vision, 2010.](https://mlanthology.org/eccv/2010/raptis2010eccv-tracklet/) doi:10.1007/978-3-642-15549-9_42

BibTeX

@inproceedings{raptis2010eccv-tracklet,
  title     = {{Tracklet Descriptors for Action Modeling and Video Analysis}},
  author    = {Raptis, Michalis and Soatto, Stefano},
  booktitle = {European Conference on Computer Vision},
  year      = {2010},
  pages     = {577--590},
  doi       = {10.1007/978-3-642-15549-9_42},
  url       = {https://mlanthology.org/eccv/2010/raptis2010eccv-tracklet/}
}