Video Segmentation and Feature Co-Occurrences for Activity Classification
Abstract
The Bag-of-Words scheme has become almost de rigueur for event recognition tasks due to its robustness and simplicity. Despite its effectiveness, this technique discards the spatial and temporal relationships between codewords. This paper tackles the problem of building a video codeword representation that captures such relationships. We develop a new method that harnesses spatio-temporal boundaries and discriminative codeword co-occurrences. Given a set of videos and their corresponding quantized features, each video is first decomposed into spatio-temporal volumes by a multi-scale video segmentation algorithm. Meaningful codeword co-occurrences are then extracted within each volume, and videos are represented as histograms of co-occurring features. The set of histograms is finally fed to an SVM for classification. Evaluation on the realistic TRECVID MED11 challenge database validates the approach.
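To make the representation concrete, the core counting step (pairs of codewords co-occurring inside the same spatio-temporal volume, flattened into a fixed-length histogram) can be sketched as follows. This is a minimal illustration, not the authors' code: volumes are assumed to be lists of quantized codeword IDs, and the pair-indexing scheme is one possible choice.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_histogram(volumes, vocab_size):
    """Build a histogram of codeword pairs co-occurring in the same volume.

    volumes: list of lists of codeword IDs (one list per spatio-temporal volume)
    vocab_size: size of the codeword vocabulary
    Returns a flat vector with one bin per unordered codeword pair (a < b).
    """
    counts = Counter()
    for codewords in volumes:
        # count each unordered pair of distinct codewords once per volume
        for a, b in combinations(sorted(set(codewords)), 2):
            counts[(a, b)] += 1

    def pair_index(a, b):
        # lexicographic index of pair (a, b) with a < b among all pairs
        return a * vocab_size - a * (a + 1) // 2 + (b - a - 1)

    vec = [0] * (vocab_size * (vocab_size - 1) // 2)
    for (a, b), c in counts.items():
        vec[pair_index(a, b)] = c
    return vec

# Example: two volumes over a 4-word vocabulary
print(cooccurrence_histogram([[0, 2, 2, 1], [1, 3]], 4))  # → [1, 1, 0, 1, 1, 0]
```

In the paper's pipeline, one such histogram per video (restricted to the discriminative co-occurrences) would then be the feature vector passed to the SVM.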
Cite
Text
Trichet and Nevatia. "Video Segmentation and Feature Co-Occurrences for Activity Classification." IEEE/CVF Winter Conference on Applications of Computer Vision, 2014. doi:10.1109/WACV.2014.6836074
Markdown
[Trichet and Nevatia. "Video Segmentation and Feature Co-Occurrences for Activity Classification." IEEE/CVF Winter Conference on Applications of Computer Vision, 2014.](https://mlanthology.org/wacv/2014/trichet2014wacv-video/) doi:10.1109/WACV.2014.6836074
BibTeX
@inproceedings{trichet2014wacv-video,
title = {{Video Segmentation and Feature Co-Occurrences for Activity Classification}},
author = {Trichet, Rémi and Nevatia, Ramakant},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2014},
pages = {385-392},
doi = {10.1109/WACV.2014.6836074},
url = {https://mlanthology.org/wacv/2014/trichet2014wacv-video/}
}