Cross-Linked Variational Autoencoders for Generalized Zero-Shot Learning

Abstract

Most approaches to generalized zero-shot learning rely on cross-modal mappings between an image feature space and a class embedding space, or on generating artificial image features. However, learning a shared cross-modal embedding by aligning the latent spaces of modality-specific autoencoders has been shown to be promising for (generalized) zero-shot learning. Following this direction, we take artificial feature generation one step further and propose a model in which aligned variational autoencoders learn a shared latent space of image features and class embeddings, for the purpose of generating latent features to train a softmax classifier. We evaluate the learned latent features on conventional benchmark datasets and establish a new state of the art on generalized zero-shot as well as few-shot learning. Moreover, our results on ImageNet with various zero-shot splits show that our latent features generalize well in large-scale settings.
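To make the abstract's idea concrete, below is a minimal, hypothetical PyTorch sketch of cross-aligned variational autoencoders: one VAE per modality (image features and class embeddings), tied together by cross-reconstruction and a latent distribution-alignment term. All dimensions, loss weights, and the simple L2 alignment used here are illustrative assumptions, not the paper's exact formulation.

# Sketch only: two modality-specific VAEs with a shared latent dimensionality,
# trained with within-modality reconstruction, KL, cross-reconstruction, and
# a latent alignment loss. Sizes and weights are assumptions for illustration.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, in_dim, latent_dim=64, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.dec(z), mu, logvar, z

def kl(mu, logvar):
    # KL divergence of the approximate posterior from the standard normal prior
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

def cross_aligned_loss(vae_img, vae_cls, x_img, x_cls, beta=1.0, gamma=1.0, delta=1.0):
    mse = nn.functional.mse_loss
    rec_i, mu_i, lv_i, z_i = vae_img(x_img)
    rec_c, mu_c, lv_c, z_c = vae_cls(x_cls)
    # Within-modality VAE losses (reconstruction + KL)
    loss = mse(rec_i, x_img) + mse(rec_c, x_cls) + beta * (kl(mu_i, lv_i) + kl(mu_c, lv_c))
    # Cross-reconstruction: decode each latent with the *other* modality's decoder
    loss = loss + gamma * (mse(vae_cls.dec(z_i), x_cls) + mse(vae_img.dec(z_c), x_img))
    # Distribution alignment: pull the two posteriors together (simple L2 stand-in)
    std_i, std_c = torch.exp(0.5 * lv_i), torch.exp(0.5 * lv_c)
    align = ((mu_i - mu_c) ** 2).sum(1).mean() + ((std_i - std_c) ** 2).sum(1).mean()
    return loss + delta * align

# Illustrative training step on a paired batch: 2048-d image features
# (e.g. CNN features) and 312-d class embeddings (e.g. attribute vectors).
vae_img, vae_cls = VAE(2048), VAE(312)
opt = torch.optim.Adam(list(vae_img.parameters()) + list(vae_cls.parameters()), lr=1e-3)
x_img, x_cls = torch.randn(32, 2048), torch.randn(32, 312)
opt.zero_grad()
cross_aligned_loss(vae_img, vae_cls, x_img, x_cls).backward()
opt.step()

After training, latent features for unseen classes can be obtained by encoding their class embeddings; together with encoded image features of seen classes, these latents serve as training data for an ordinary softmax classifier, which is the generation step the abstract refers to.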

Cite

Text

Schönfeld et al. "Cross-Linked Variational Autoencoders for Generalized Zero-Shot Learning." ICLR 2019 Workshops: LLD, 2019.

Markdown

[Schönfeld et al. "Cross-Linked Variational Autoencoders for Generalized Zero-Shot Learning." ICLR 2019 Workshops: LLD, 2019.](https://mlanthology.org/iclrw/2019/schonfeld2019iclrw-crosslinked/)

BibTeX

@inproceedings{schonfeld2019iclrw-crosslinked,
  title     = {{Cross-Linked Variational Autoencoders for Generalized Zero-Shot Learning}},
  author    = {Schönfeld, Edgar and Ebrahimi, Sayna and Sinha, Samarth and Darrell, Trevor and Akata, Zeynep},
  booktitle = {ICLR 2019 Workshops: LLD},
  year      = {2019},
  url       = {https://mlanthology.org/iclrw/2019/schonfeld2019iclrw-crosslinked/}
}