CLAREL: Classification via Retrieval Loss for Zero-Shot Learning
Abstract
We address the problem of learning cross-modal representations. We propose an instance-based deep metric learning approach in a joint visual and textual space. The key novelty of this paper is showing that per-image semantic supervision leads to substantial improvement in zero-shot performance over class-only supervision. We also provide a probabilistic justification and empirical validation for a metric rescaling approach that balances seen/unseen accuracy in the GZSL task. We evaluate our approach on two fine-grained zero-shot datasets: CUB and Flowers.
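The instance-based retrieval loss described in the abstract can be sketched as a symmetric cross-entropy over cosine similarities between paired image and text embeddings, where each image is supervised by its own textual description rather than only by its class label. This is a hedged illustration of the general technique, not the authors' exact implementation; the function name, temperature value, and NumPy formulation are assumptions.

```python
import numpy as np

def retrieval_loss(img_emb, txt_emb, temperature=0.1):
    """Symmetric per-instance retrieval loss over a batch of paired
    image/text embeddings (illustrative sketch, not the paper's code)."""
    # L2-normalize so dot products become cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (B, B); matching pairs on the diagonal

    def cross_entropy(l):
        # Softmax cross-entropy with the diagonal as the target class.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image retrieval terms.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

With aligned pairs on the diagonal, the loss is low when each image embedding is closest to its own description and high when the pairing is scrambled, which is what drives the per-image supervision signal.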
Cite
Text
Oreshkin et al. "CLAREL: Classification via Retrieval Loss for Zero-Shot Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00466
Markdown
[Oreshkin et al. "CLAREL: Classification via Retrieval Loss for Zero-Shot Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/oreshkin2020cvprw-clarel/) doi:10.1109/CVPRW50498.2020.00466
BibTeX
@inproceedings{oreshkin2020cvprw-clarel,
title = {{CLAREL: Classification via Retrieval Loss for Zero-Shot Learning}},
author = {Oreshkin, Boris N. and Rostamzadeh, Negar and Pinheiro, Pedro O. and Pal, Christopher J.},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2020},
pages = {3989-3993},
doi = {10.1109/CVPRW50498.2020.00466},
url = {https://mlanthology.org/cvprw/2020/oreshkin2020cvprw-clarel/}
}