Beyond Short Snippets: Deep Networks for Video Classification

Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

CVPR 2015

doi:10.1109/CVPR.2015.7299101

Abstract

Convolutional neural networks (CNNs) have been extensively applied for image recognition problems giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep neural network architectures to combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handling full length videos. The first method explores various convolutional temporal feature pooling architectures, examining the various design choices which need to be made when adapting a CNN for this task. The second proposed method explicitly models the video as an ordered sequence of frames. For this purpose we employ a recurrent neural network that uses Long Short-Term Memory (LSTM) cells which are connected to the output of the underlying CNN. Our best networks exhibit significant performance improvements over previously published results on the Sports 1 million dataset (73.1% vs. 60.9%) and the UCF-101 datasets with (88.2% vs. 87.9%) and without additional optical flow information (82.6% vs. 72.8%).
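As a rough illustration of the first idea in the abstract (pooling per-frame CNN features over time), the sketch below takes an element-wise maximum across frames to get one fixed-length video descriptor. It is not the authors' architecture: the function name `temporal_max_pool`, the use of max pooling specifically, and the toy 4-dimensional features are all assumptions made for illustration, and the abstract does not say which pooling variant performed best.

```python
from typing import List, Sequence


def temporal_max_pool(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Collapse per-frame feature vectors into one vector by element-wise max.

    Hypothetical helper: each inner sequence stands in for the CNN output
    of a single frame; all frames must share the same dimensionality.
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(frame) != dim for frame in frame_features):
        raise ValueError("all frames must have the same feature dimension")
    # For each feature index, keep the largest activation seen in any frame.
    return [max(frame[i] for frame in frame_features) for i in range(dim)]


# Toy clip: 3 frames, 4 made-up feature values per frame.
clip = [
    [0.1, 0.9, 0.0, 0.3],
    [0.4, 0.2, 0.5, 0.3],
    [0.2, 0.1, 0.7, 0.8],
]
video_descriptor = temporal_max_pool(clip)
```

Pooling of this kind ignores frame order, so reversing `clip` yields the same descriptor. That is the contrast with the second method in the abstract, which treats the video as an ordered sequence and feeds the CNN outputs to an LSTM.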

Cite

Text

Ng et al. "Beyond Short Snippets: Deep Networks for Video Classification." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7299101

Markdown

[Ng et al. "Beyond Short Snippets: Deep Networks for Video Classification." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/ng2015cvpr-beyond/) doi:10.1109/CVPR.2015.7299101

BibTeX

@inproceedings{ng2015cvpr-beyond,
  title     = {{Beyond Short Snippets: Deep Networks for Video Classification}},
  author    = {Ng, Joe Yue-Hei and Hausknecht, Matthew and Vijayanarasimhan, Sudheendra and Vinyals, Oriol and Monga, Rajat and Toderici, George},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2015},
  doi       = {10.1109/CVPR.2015.7299101},
  url       = {https://mlanthology.org/cvpr/2015/ng2015cvpr-beyond/}
}