Tracklet Descriptors for Action Modeling and Video Analysis
Abstract
We present spatio-temporal feature descriptors that can be inferred from video and used as building blocks in action recognition systems. They capture the evolution of “elementary action elements” under a set of assumptions on the image-formation model and are designed to be insensitive to nuisance variability (absolute position, contrast), while retaining the discriminative statistics that arise from fine-scale motion and local shape in compact regions of the image. Despite their simplicity, these descriptors, used in conjunction with basic classifiers, attain state-of-the-art performance in action recognition on benchmark datasets.
Cite
Text
Raptis and Soatto. "Tracklet Descriptors for Action Modeling and Video Analysis." European Conference on Computer Vision, 2010. doi:10.1007/978-3-642-15549-9_42
Markdown
[Raptis and Soatto. "Tracklet Descriptors for Action Modeling and Video Analysis." European Conference on Computer Vision, 2010.](https://mlanthology.org/eccv/2010/raptis2010eccv-tracklet/) doi:10.1007/978-3-642-15549-9_42
BibTeX
@inproceedings{raptis2010eccv-tracklet,
title = {{Tracklet Descriptors for Action Modeling and Video Analysis}},
author = {Raptis, Michalis and Soatto, Stefano},
booktitle = {European Conference on Computer Vision},
year = {2010},
pages = {577--590},
doi = {10.1007/978-3-642-15549-9_42},
url = {https://mlanthology.org/eccv/2010/raptis2010eccv-tracklet/}
}