Generalized Many-Way Few-Shot Video Classification
Abstract
Few-shot learning methods operate in low data regimes. The aim is to learn with few training examples per class. Although significant progress has been made in few-shot image classification, few-shot video recognition is relatively unexplored and methods based on 2D CNNs are unable to learn temporal information. In this work we thus develop a simple 3D CNN baseline, surpassing existing methods by a large margin. To circumvent the need for labeled examples, we propose to leverage weakly-labeled videos from a large dataset using tag retrieval followed by selecting the best clips with visual similarities, yielding further improvement. Our results saturate current 5-way benchmarks for few-shot video classification and therefore we propose a new challenging benchmark involving more classes and a mixture of classes with varying supervision.
Cite
Text
Xian et al. "Generalized Many-Way Few-Shot Video Classification." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-65414-6_10
Markdown
[Xian et al. "Generalized Many-Way Few-Shot Video Classification." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/xian2020eccvw-generalized/) doi:10.1007/978-3-030-65414-6_10
BibTeX
@inproceedings{xian2020eccvw-generalized,
title = {{Generalized Many-Way Few-Shot Video Classification}},
author = {Xian, Yongqin and Korbar, Bruno and Douze, Matthijs and Schiele, Bernt and Akata, Zeynep and Torresani, Lorenzo},
booktitle = {European Conference on Computer Vision Workshops},
year = {2020},
pages = {111-127},
doi = {10.1007/978-3-030-65414-6_10},
url = {https://mlanthology.org/eccvw/2020/xian2020eccvw-generalized/}
}