Zero-Shot Learning via Visual Abstraction
Abstract
One of the main challenges in learning fine-grained visual categories is gathering training images. Recent work in Zero-Shot Learning (ZSL) circumvents this challenge by describing categories via attributes or text. However, not all visual concepts, e.g., two people dancing, are easily amenable to such descriptions. In this paper, we propose a new modality for ZSL using visual abstraction to learn difficult-to-describe concepts. Specifically, we explore concepts related to people and their interactions with others. Our proposed modality allows one to provide training data by manipulating abstract visualizations, e.g., one can illustrate interactions between two clipart people by manipulating each person’s pose, expression, gaze, and gender. The feasibility of our approach is shown on a human pose dataset and a new dataset containing complex interactions between two people, where we outperform several baselines. To better match across the two domains, we learn an explicit mapping between the abstract and real worlds.
Cite
Text
Antol et al. "Zero-Shot Learning via Visual Abstraction." European Conference on Computer Vision, 2014. doi:10.1007/978-3-319-10593-2_27
Markdown
[Antol et al. "Zero-Shot Learning via Visual Abstraction." European Conference on Computer Vision, 2014.](https://mlanthology.org/eccv/2014/antol2014eccv-zero/) doi:10.1007/978-3-319-10593-2_27
BibTeX
@inproceedings{antol2014eccv-zero,
title = {{Zero-Shot Learning via Visual Abstraction}},
author = {Antol, Stanislaw and Zitnick, C. Lawrence and Parikh, Devi},
booktitle = {European Conference on Computer Vision},
year = {2014},
pages = {401--416},
doi = {10.1007/978-3-319-10593-2_27},
url = {https://mlanthology.org/eccv/2014/antol2014eccv-zero/}
}