Learning Video Features for Multi-Label Classification

Abstract

This paper studies approaches to learning video representations. The work was done as part of the YouTube-8M Video Understanding Challenge. The main focus is to analyze various approaches to modeling temporal data and to evaluate their performance on this problem. A model is also proposed that reduces the feature vector size by 70% without compromising accuracy. The first approach uses recurrent neural network architectures to learn a single video-level feature from frame-level features and then uses this aggregated feature for multi-label classification. The second approach uses video-level features and deep neural networks to assign the labels.
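The first approach described above can be sketched as follows: a recurrent network consumes the sequence of frame-level features and its final hidden state serves as the video-level feature, which a per-label sigmoid classifier then scores. This is a minimal illustration in pure Python with a vanilla RNN cell and toy dimensions; the actual architectures, feature sizes, and training procedure in the paper differ, and all names and weights here are hypothetical.

```python
import math
import random

random.seed(0)

def rnn_aggregate(frames, W_x, W_h, b):
    """Aggregate frame-level features into one video-level feature.

    Vanilla RNN step: h_t = tanh(W_x x_t + W_h h_{t-1} + b).
    The final hidden state is used as the video-level feature.
    """
    h = [0.0] * len(b)
    for x in frames:
        h = [math.tanh(sum(W_x[i][j] * x[j] for j in range(len(x)))
                       + sum(W_h[i][k] * h[k] for k in range(len(h)))
                       + b[i])
             for i in range(len(b))]
    return h

def classify(h, W_out, b_out):
    """Multi-label head: independent sigmoid per label (labels are not exclusive)."""
    return [1.0 / (1.0 + math.exp(-(sum(W_out[c][i] * h[i] for i in range(len(h)))
                                    + b_out[c])))
            for c in range(len(b_out))]

# Toy dimensions (hypothetical): 4-dim frame features, 3-dim hidden state, 5 labels.
D, H, C = 4, 3, 5
rand_mat = lambda rows, cols: [[random.uniform(-0.5, 0.5) for _ in range(cols)]
                               for _ in range(rows)]
W_x, W_h, b = rand_mat(H, D), rand_mat(H, H), [0.0] * H
W_out, b_out = rand_mat(C, H), [0.0] * C

# A video as a sequence of 10 random frame-level feature vectors.
frames = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(10)]
video_feat = rnn_aggregate(frames, W_x, W_h, b)   # single video-level feature
probs = classify(video_feat, W_out, b_out)        # one probability per label
print(len(video_feat), len(probs))  # prints: 3 5
```

In practice the weights would be trained end to end with a per-label binary cross-entropy loss, since each video can carry several labels at once.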

Cite

Text

Garg. "Learning Video Features for Multi-Label Classification." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11018-5_30

Markdown

[Garg. "Learning Video Features for Multi-Label Classification." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/garg2018eccvw-learning/) doi:10.1007/978-3-030-11018-5_30

BibTeX

@inproceedings{garg2018eccvw-learning,
  title     = {{Learning Video Features for Multi-Label Classification}},
  author    = {Garg, Shivam},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2018},
  pages     = {325--337},
  doi       = {10.1007/978-3-030-11018-5_30},
  url       = {https://mlanthology.org/eccvw/2018/garg2018eccvw-learning/}
}