Self-Paced Adversarial Training for Multimodal Few-Shot Learning

Abstract

State-of-the-art deep learning algorithms yield remarkable results in many visual recognition tasks. However, they still fail to provide satisfactory results in scarce data regimes. To a certain extent, this lack of data can be compensated for by multimodal information. Missing information in one modality of a single data point (e.g. an image) can be made up for in another modality (e.g. a textual description). Therefore, we design a few-shot learning task that is multimodal during training (i.e. image and text) and single-modal during test time (i.e. image). In this regard, we propose a self-paced class-discriminative generative adversarial network incorporating multimodality in the context of few-shot learning. The proposed approach builds upon the idea of cross-modal data generation in order to alleviate the data sparsity problem. We improve few-shot learning accuracies on the fine-grained CUB and Oxford-102 datasets.
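To make the idea of cross-modal data generation with self-paced sample selection more concrete, here is a minimal, hypothetical PyTorch sketch: a text-conditional generator produces synthetic image features for few-shot classes, a class-discriminative critic scores them, and only the samples the critic is most confident about for the target class are kept for augmentation. All module names, dimensions, and the selection heuristic are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch (not the paper's exact model): text-conditional feature
# generation plus self-paced selection of generated samples for augmentation.
import torch
import torch.nn as nn

TEXT_DIM, NOISE_DIM, FEAT_DIM, NUM_CLASSES = 1024, 100, 2048, 200  # assumed sizes

class Generator(nn.Module):
    """Maps a text embedding + noise vector to a synthetic image feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TEXT_DIM + NOISE_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, FEAT_DIM))

    def forward(self, text_emb, noise):
        return self.net(torch.cat([text_emb, noise], dim=1))

class Discriminator(nn.Module):
    """Class-discriminative critic: a real/fake head plus a class logit head."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(FEAT_DIM, 1024), nn.LeakyReLU(0.2))
        self.adv_head = nn.Linear(1024, 1)            # real vs. fake score
        self.cls_head = nn.Linear(1024, NUM_CLASSES)  # class prediction

    def forward(self, feat):
        h = self.body(feat)
        return self.adv_head(h), self.cls_head(h)

def self_paced_select(gen_feats, labels, disc, keep_ratio=0.5):
    """Keep the generated features the critic assigns highest confidence to
    for the intended class (an 'easy examples first' curriculum heuristic)."""
    with torch.no_grad():
        _, cls_logits = disc(gen_feats)
        conf = cls_logits.softmax(dim=1).gather(1, labels.unsqueeze(1)).squeeze(1)
    k = max(1, int(keep_ratio * gen_feats.size(0)))
    idx = conf.topk(k).indices
    return gen_feats[idx], labels[idx]

# Toy usage: augment a 1-shot class with selected synthetic features.
G, D = Generator(), Discriminator()
text_emb = torch.randn(16, TEXT_DIM)              # embeddings of textual descriptions
labels = torch.full((16,), 3, dtype=torch.long)   # all belong to class id 3
fake = G(text_emb, torch.randn(16, NOISE_DIM))
aug_feats, aug_labels = self_paced_select(fake, labels, D)
```

In a full pipeline the selected synthetic features would be mixed with the few real training examples before fitting the downstream classifier; the adversarial and classification losses for training G and D are omitted here for brevity.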

Cite

Text

Pahde et al. "Self-Paced Adversarial Training for Multimodal Few-Shot Learning." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019. doi:10.1109/WACV.2019.00029

Markdown

[Pahde et al. "Self-Paced Adversarial Training for Multimodal Few-Shot Learning." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019.](https://mlanthology.org/wacv/2019/pahde2019wacv-self/) doi:10.1109/WACV.2019.00029

BibTeX

@inproceedings{pahde2019wacv-self,
  title     = {{Self-Paced Adversarial Training for Multimodal Few-Shot Learning}},
  author    = {Pahde, Frederik and Ostapenko, Oleksiy and Jähnichen, Patrick and Klein, Tassilo and Nabi, Moin},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2019},
  pages     = {218-226},
  doi       = {10.1109/WACV.2019.00029},
  url       = {https://mlanthology.org/wacv/2019/pahde2019wacv-self/}
}