Building a Size Constrained Predictive Models for Video Classification

Skalic, Miha; Austin, David

doi:10.1007/978-3-030-11018-5_27

ECCVW 2018 pp. 297-305

doi:10.1007/978-3-030-11018-5_27 /eccvw/2018/skalic2018eccvw-building/

Abstract

Herein we present the solution to the $2^\mathrm{nd}$ YouTube-8M video understanding challenge, which placed $1^\mathrm{st}$. Competition participants were tasked with building a size-constrained video labeling model with a model size of less than 1 GB. Our final solution consists of several submodels belonging to the Fisher vectors, NetVLAD, Deep Bag of Frames, and recurrent neural network model families. To make the classifier efficient under size constraints, we introduced model distillation, partial weight quantization, and training with exponential moving average.
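The abstract names exponential moving averaging and weight quantization without describing them, so the following is only a generic illustration of those two ideas, not the authors' implementation. The function names, the decay value, and the choice of uniform 8-bit quantization are all assumptions made for this sketch; the abstract's "partial" quantization implies that only some weights are quantized, and the selection rule is not stated here.

```python
# Illustrative sketch only: names, decay, and the 8-bit scheme are assumptions,
# not details taken from the paper.

def ema_update(shadow, weights, decay=0.999):
    """Return the exponential moving average of the weights after one step."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


def quantize_uint8(weights):
    """Map floats to integer codes in [0, 255] plus the offset and step to decode."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 when all weights are equal
    codes = [min(255, max(0, round((w - lo) / step))) for w in weights]
    return codes, lo, step


def dequantize_uint8(codes, lo, step):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + c * step for c in codes]


if __name__ == "__main__":
    shadow = ema_update([0.0, 0.0], [1.0, 2.0], decay=0.9)
    codes, lo, step = quantize_uint8([-1.0, 0.0, 1.0])
    print(shadow, codes, dequantize_uint8(codes, lo, step))
```

Storing one byte per weight instead of a 32-bit float cuts the storage for those weights to roughly a quarter, at the cost of a rounding error of at most half a quantization step per weight.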

Cite

Text

Skalic and Austin. "Building a Size Constrained Predictive Models for Video Classification." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11018-5_27

Markdown

[Skalic and Austin. "Building a Size Constrained Predictive Models for Video Classification." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/skalic2018eccvw-building/) doi:10.1007/978-3-030-11018-5_27

BibTeX

@inproceedings{skalic2018eccvw-building,
  title     = {{Building a Size Constrained Predictive Models for Video Classification}},
  author    = {Skalic, Miha and Austin, David},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2018},
  pages     = {297--305},
  doi       = {10.1007/978-3-030-11018-5_27},
  url       = {https://mlanthology.org/eccvw/2018/skalic2018eccvw-building/}
}