Action Recognition with Multiscale Spatio-Temporal Contexts

Abstract

The popular bag-of-words approach to action recognition classifies actions based on the density of quantized local features. This approach focuses excessively on the local features and discards all information about the interactions among them. Local features by themselves may not be discriminative enough, but combined with their contexts they can be very useful for recognizing some actions. In this paper, we present a novel representation that captures contextual interactions between interest points, based on the density of all features observed in each interest point's multiscale spatio-temporal contextual domain. We demonstrate that augmenting local features with our contextual feature significantly improves recognition performance.
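
To make the idea of a multiscale spatio-temporal context concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes each interest point has an (x, y, t) position and a quantized codeword index, and it approximates the "density of all features" in a contextual domain by a normalized codeword histogram over the neighbors inside a hard spatio-temporal window. The function name `multiscale_context_histograms`, the window shape, and the `scales` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

def multiscale_context_histograms(positions, words, vocab_size, scales):
    """Illustrative contextual feature (assumed design, not the paper's).

    positions  : (N, 3) array of interest-point coordinates (x, y, t)
    words      : (N,) integer codeword index of each interest point
    vocab_size : number of codewords V
    scales     : list of (spatial_radius, temporal_radius) pairs
    returns    : (N, V * len(scales)) array, one histogram per scale
    """
    positions = np.asarray(positions, dtype=float)
    words = np.asarray(words, dtype=int)
    one_hot = np.eye(vocab_size)[words]              # (N, V)

    # Pairwise spatial and temporal distances between interest points.
    diff = positions[:, None, :] - positions[None, :, :]
    spatial = np.hypot(diff[..., 0], diff[..., 1])   # (N, N)
    temporal = np.abs(diff[..., 2])                  # (N, N)

    per_scale = []
    for r_s, r_t in scales:
        # Hard window: neighbors within both the spatial and temporal radius.
        inside = (spatial <= r_s) & (temporal <= r_t)
        np.fill_diagonal(inside, False)              # exclude the point itself
        counts = inside.astype(float) @ one_hot      # codeword counts per point
        totals = counts.sum(axis=1, keepdims=True)
        # Normalize to a density; points with no neighbors keep an all-zero row.
        hist = np.divide(counts, totals,
                         out=np.zeros_like(counts), where=totals > 0)
        per_scale.append(hist)
    return np.hstack(per_scale)

# Toy usage: three interest points, two codewords, two context scales.
F = multiscale_context_histograms(
    positions=[[0, 0, 0], [1, 0, 0], [10, 0, 0]],
    words=[0, 1, 1],
    vocab_size=2,
    scales=[(2, 1), (20, 1)],
)
```

Under these assumptions, each row of `F` could be concatenated with the point's own local descriptor before quantization or classification, which is one plausible reading of "augmenting local features with our contextual feature".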

Cite

Text

Wang et al. "Action Recognition with Multiscale Spatio-Temporal Contexts." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2011. doi:10.1109/CVPR.2011.5995493

Markdown

[Wang et al. "Action Recognition with Multiscale Spatio-Temporal Contexts." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2011.](https://mlanthology.org/cvpr/2011/wang2011cvpr-action/) doi:10.1109/CVPR.2011.5995493

BibTeX

@inproceedings{wang2011cvpr-action,
  title     = {{Action Recognition with Multiscale Spatio-Temporal Contexts}},
  author    = {Wang, Jiang and Chen, Zhuoyuan and Wu, Ying},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2011},
  pages     = {3185-3192},
  doi       = {10.1109/CVPR.2011.5995493},
  url       = {https://mlanthology.org/cvpr/2011/wang2011cvpr-action/}
}