Video Segmentation and Feature Co-Occurrences for Activity Classification

Abstract

The Bag-of-Words scheme has become almost de rigueur for event recognition tasks due to its robustness and simplicity. Despite its effectiveness, this technique discards spatial and temporal relationships between codewords. This paper tackles the problem of building a video codeword representation that captures such relationships. We developed a new method that harnesses spatio-temporal boundaries and discriminative codeword co-occurrences. Given a set of videos and their corresponding quantized features, each video is first decomposed into spatio-temporal volumes according to a multi-scale video segmentation algorithm. Meaningful codeword co-occurrences are then extracted within each volume, and each video is represented as a histogram of co-occurring features. The set of histograms is finally fed to an SVM for classification. Evaluation on the realistic TRECVID MED11 challenge database validates the approach.
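The core representation described above, counting pairs of quantized codewords that fall in the same spatio-temporal volume, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each local feature already carries a segment id (from the video segmentation) and a codeword id (from vector quantization), and it omits the discriminative pair-selection step; the resulting histogram would be the input to an SVM.

```python
import numpy as np
from itertools import combinations

def cooccurrence_histogram(volume_labels, codewords, K):
    """Histogram of unordered codeword pairs co-occurring in one volume.

    volume_labels[i]: spatio-temporal segment id of local feature i
    codewords[i]:     quantized codeword id (0..K-1) of local feature i
    Returns a vector of length K*(K+1)//2, one bin per unordered pair.
    """
    # Fixed bin layout for unordered pairs (a, b) with a <= b.
    pair_index = {}
    idx = 0
    for a in range(K):
        for b in range(a, K):
            pair_index[(a, b)] = idx
            idx += 1

    hist = np.zeros(len(pair_index))
    # Pairs are only counted when both features lie in the same volume,
    # so the segmentation boundaries constrain which co-occurrences count.
    for v in np.unique(volume_labels):
        words = sorted(codewords[volume_labels == v])
        for a, b in combinations(words, 2):
            hist[pair_index[(a, b)]] += 1
    return hist
```

With `K=3`, features quantized as `[0, 1, 1, 2]`, and segment labels `[0, 0, 1, 1]`, the first volume contributes one count to pair (0, 1) and the second to pair (1, 2). In practice one such histogram per video (possibly L1- or L2-normalized) is what gets passed to the SVM.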

Cite

Text

Trichet and Nevatia. "Video Segmentation and Feature Co-Occurrences for Activity Classification." IEEE/CVF Winter Conference on Applications of Computer Vision, 2014. doi:10.1109/WACV.2014.6836074

Markdown

[Trichet and Nevatia. "Video Segmentation and Feature Co-Occurrences for Activity Classification." IEEE/CVF Winter Conference on Applications of Computer Vision, 2014.](https://mlanthology.org/wacv/2014/trichet2014wacv-video/) doi:10.1109/WACV.2014.6836074

BibTeX

@inproceedings{trichet2014wacv-video,
  title     = {{Video Segmentation and Feature Co-Occurrences for Activity Classification}},
  author    = {Trichet, Rémi and Nevatia, Ramakant},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2014},
  pages     = {385--392},
  doi       = {10.1109/WACV.2014.6836074},
  url       = {https://mlanthology.org/wacv/2014/trichet2014wacv-video/}
}