Modeling the Temporal Extent of Actions
Abstract
In this paper, we present a framework for estimating what portions of videos are most discriminative for the task of action recognition. We explore the impact of the temporal cropping of training videos on the overall accuracy of an action recognition system, and we formalize what makes a set of croppings optimal. In addition, we present an algorithm to determine the best set of croppings for a dataset, and experimentally show that our approach increases the accuracy of various state-of-the-art action recognition techniques.
Cite
Text
Satkin and Hebert. "Modeling the Temporal Extent of Actions." European Conference on Computer Vision, 2010. doi:10.1007/978-3-642-15549-9_39
Markdown
[Satkin and Hebert. "Modeling the Temporal Extent of Actions." European Conference on Computer Vision, 2010.](https://mlanthology.org/eccv/2010/satkin2010eccv-modeling/) doi:10.1007/978-3-642-15549-9_39
BibTeX
@inproceedings{satkin2010eccv-modeling,
title = {{Modeling the Temporal Extent of Actions}},
author = {Satkin, Scott and Hebert, Martial},
booktitle = {European Conference on Computer Vision},
year = {2010},
pages = {536-548},
doi = {10.1007/978-3-642-15549-9_39},
url = {https://mlanthology.org/eccv/2010/satkin2010eccv-modeling/}
}