Spatio-Temporal Video Representation with Locality-Constrained Linear Coding

Abstract

This paper presents a spatio-temporal coding technique for video sequences. The framework is based on a space-time extension of the scale-invariant feature transform (SIFT) combined with locality-constrained linear coding (LLC). The coding scheme projects each spatio-temporal descriptor onto a local coordinate system, and the final representation is produced by max pooling. The extension is evaluated on human action classification tasks. Experiments with the KTH, Weizmann, UCF Sports and Hollywood datasets indicate that the approach produces results comparable to the state of the art.
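The LLC step described in the abstract can be illustrated with a short sketch. This is not the paper's implementation; it follows the standard approximated LLC recipe (restrict each descriptor to its k nearest codewords, solve a small constrained least-squares problem in closed form, then max-pool the codes over all descriptors in a video). The codebook, descriptor dimensions, and the regularisation constant are illustrative assumptions.

```python
import numpy as np

def llc_code(x, codebook, k=5):
    """Approximated LLC: encode descriptor x (D,) over a codebook (M, D)."""
    # Distances from x to every codeword; keep the k nearest as local bases.
    d = np.linalg.norm(codebook - x, axis=1)
    idx = np.argsort(d)[:k]
    B = codebook[idx]                      # (k, D) local bases
    z = B - x                              # shift bases to the descriptor
    C = z @ z.T                            # local covariance
    C += np.eye(k) * 1e-4 * np.trace(C)    # regularisation (illustrative value)
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                           # enforce the sum-to-one constraint
    code = np.zeros(len(codebook))
    code[idx] = w                          # sparse code over the full codebook
    return code

def max_pool(codes):
    """Pool a stack of LLC codes (N, M) into one video-level vector (M,)."""
    return codes.max(axis=0)
```

Only the k selected entries of each code are non-zero, so the video-level vector after max pooling stays sparse and cheap to classify with a linear SVM, which is the usual motivation for pairing LLC with max pooling.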

Cite

Text

Al Ghamdi et al. "Spatio-Temporal Video Representation with Locality-Constrained Linear Coding." European Conference on Computer Vision, 2012. doi:10.1007/978-3-642-33885-4_11

Markdown

[Al Ghamdi et al. "Spatio-Temporal Video Representation with Locality-Constrained Linear Coding." European Conference on Computer Vision, 2012.](https://mlanthology.org/eccv/2012/ghamdi2012eccv-spatio-a/) doi:10.1007/978-3-642-33885-4_11

BibTeX

@inproceedings{ghamdi2012eccv-spatio-a,
  title     = {{Spatio-Temporal Video Representation with Locality-Constrained Linear Coding}},
  author    = {Al Ghamdi, Manal and Al Harbi, Nouf and Gotoh, Yoshihiko},
  booktitle = {European Conference on Computer Vision},
  year      = {2012},
  pages     = {101--110},
  doi       = {10.1007/978-3-642-33885-4_11},
  url       = {https://mlanthology.org/eccv/2012/ghamdi2012eccv-spatio-a/}
}