Embedding Sequential Information into Spatiotemporal Features for Action Recognition
Abstract
In this paper, we introduce a novel framework for video-based action recognition that combines sequential information with spatiotemporal features. Specifically, spatiotemporal features are extracted from sliced clips of a video, and a recurrent neural network is then applied to embed the sequential information into the final feature representation of the video. In contrast to most current deep learning methods for video-based tasks, our framework incorporates both the long-term dependencies and the spatiotemporal information of the clips in the video. To extract spatiotemporal features from the clips, both dense trajectories (DT) and a recently proposed 3D convolutional neural network, C3D, are applied in our experiments. Our proposed framework is evaluated on the benchmark datasets UCF101 and HMDB51, and achieves performance comparable to state-of-the-art results.
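The pipeline the abstract describes — per-clip spatiotemporal features aggregated by a recurrent network into one video-level representation — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the vanilla-RNN update, and the random stand-in features (in place of real C3D or DT descriptors) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration only.
num_clips, feat_dim, hidden_dim = 8, 16, 12

# Stand-ins for per-clip spatiotemporal features (e.g., C3D or DT
# descriptors in the paper); one row per sliced clip, in temporal order.
clip_features = rng.standard_normal((num_clips, feat_dim))

# Randomly initialized vanilla-RNN parameters (hypothetical).
W_xh = rng.standard_normal((feat_dim, hidden_dim)) * 0.1
W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)

# Recurrently fold the clip sequence into a single hidden state, so the
# final state carries the sequential (long-term) information.
h = np.zeros(hidden_dim)
for x in clip_features:
    h = np.tanh(x @ W_xh + h @ W_hh + b_h)

video_representation = h  # video-level feature fed to a classifier
print(video_representation.shape)  # (12,)
```

In practice the final representation would be passed to a classifier over the action categories; the abstract leaves those details to the full paper.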
Cite
Text
Ye and Tian. "Embedding Sequential Information into Spatiotemporal Features for Action Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2016. doi:10.1109/CVPRW.2016.142
Markdown
[Ye and Tian. "Embedding Sequential Information into Spatiotemporal Features for Action Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2016.](https://mlanthology.org/cvprw/2016/ye2016cvprw-embedding/) doi:10.1109/CVPRW.2016.142
BibTeX
@inproceedings{ye2016cvprw-embedding,
title = {{Embedding Sequential Information into Spatiotemporal Features for Action Recognition}},
author = {Ye, Yuancheng and Tian, Yingli},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2016},
pages = {1110-1118},
doi = {10.1109/CVPRW.2016.142},
url = {https://mlanthology.org/cvprw/2016/ye2016cvprw-embedding/}
}