ACTIVE: Activity Concept Transitions in Video Event Classification
Abstract
The goal of high-level event classification from videos is to assign a single, high-level event label to each query video. Traditional approaches represent each video as a set of low-level features and encode it into a fixed-length feature vector (e.g., Bag-of-Words), which leaves a large gap between low-level visual features and high-level events. Our paper addresses this problem by exploiting activity concept transitions in video events (ACTIVE). A video is treated as a sequence of short clips, all of which are observations corresponding to latent activity concept variables in a Hidden Markov Model (HMM). We propose to apply Fisher Kernel techniques so that the concept transitions over time can be encoded into a compact, fixed-length feature vector very efficiently. Our approach can utilize concept annotations from independent datasets, and works well even with a very small number of training samples. Experiments on the challenging NIST TRECVID Multimedia Event Detection (MED) dataset show that our approach performs favorably against the state of the art.
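To make the encoding idea concrete, below is a minimal sketch, not the authors' implementation: it assumes per-clip concept detector scores serve as HMM emission likelihoods, runs a scaled forward-backward pass, and takes the gradient of the sequence log-likelihood with respect to a softmax-parameterized transition matrix as the fixed-length video descriptor. The Fisher information normalization of the full Fisher Kernel is omitted for brevity, and all function and variable names are hypothetical.

import numpy as np

def hmm_transition_fisher_vector(B, A, pi, eps=1e-10):
    """Fisher-score sketch over an HMM's transition parameters.

    B  : (T, K) per-clip concept likelihoods (soft detector scores)
    A  : (K, K) transition matrix, rows sum to 1
    pi : (K,)   initial state distribution
    Returns a K*K vector whose length is independent of T.
    """
    T, K = B.shape
    # Scaled forward pass: alpha[t] is P(state_t | clips 0..t), c[t] the scaler.
    alpha = np.zeros((T, K))
    c = np.zeros(T)
    alpha[0] = pi * B[0]
    c[0] = alpha[0].sum() + eps
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        c[t] = alpha[t].sum() + eps
        alpha[t] /= c[t]
    # Scaled backward pass.
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[t + 1] * beta[t + 1])) / c[t + 1]
    # Accumulate d log P / d theta(i,j) for A(i,j) = softmax_j(theta(i,j)):
    # expected transition counts xi minus state marginals gamma times A.
    G = np.zeros((K, K))
    for t in range(T - 1):
        xi = (alpha[t][:, None] * A) * (B[t + 1] * beta[t + 1])[None, :] / c[t + 1]
        gamma = alpha[t] * beta[t]
        G += xi - gamma[:, None] * A
    g = G.ravel()
    # Power and L2 normalization, common in Fisher-vector pipelines.
    g = np.sign(g) * np.sqrt(np.abs(g))
    return g / (np.linalg.norm(g) + eps)

# Usage with made-up data: a 12-clip video over 5 concepts yields a
# 25-dimensional descriptor regardless of video length.
rng = np.random.default_rng(0)
B = rng.random((12, 5))          # fake per-clip concept scores
A = np.full((5, 5), 0.2)         # uniform transition prior
pi = np.full(5, 0.2)
fv = hmm_transition_fisher_vector(B, A, pi)
print(fv.shape)                  # (25,)

Because the descriptor is a gradient with respect to fixed model parameters, videos of different lengths map to vectors of the same dimension, which is what allows a standard linear classifier to be trained on top.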
Cite
Text
Sun and Nevatia. "ACTIVE: Activity Concept Transitions in Video Event Classification." International Conference on Computer Vision, 2013. doi:10.1109/ICCV.2013.453
Markdown
[Sun and Nevatia. "ACTIVE: Activity Concept Transitions in Video Event Classification." International Conference on Computer Vision, 2013.](https://mlanthology.org/iccv/2013/sun2013iccv-active/) doi:10.1109/ICCV.2013.453
BibTeX
@inproceedings{sun2013iccv-active,
title = {{ACTIVE: Activity Concept Transitions in Video Event Classification}},
author = {Sun, Chen and Nevatia, Ram},
booktitle = {International Conference on Computer Vision},
year = {2013},
doi = {10.1109/ICCV.2013.453},
url = {https://mlanthology.org/iccv/2013/sun2013iccv-active/}
}