Zero-Shot Synthesis with Group-Supervised Learning

Abstract

Visual cognition of primates is superior to that of artificial neural networks in its ability to “envision” a visual object, even a newly-introduced one, in different attributes including pose, position, color, and texture. To help neural networks envision objects with different attributes, we propose a family of objective functions, expressed on groups of examples, as a novel learning framework that we term Group-Supervised Learning (GSL). GSL allows us to decompose inputs into a disentangled representation with swappable components that can be recombined to synthesize new samples. For instance, images of red boats and blue cars can be decomposed and recombined to synthesize novel images of red cars. We propose an auto-encoder-based implementation, termed group-supervised zero-shot synthesis network (GZS-Net), which, trained with our learning framework, can produce a high-quality red car even if no such example is witnessed during training. We test our model and learning framework on existing benchmarks, in addition to a new dataset that we open-source. We qualitatively and quantitatively demonstrate that GZS-Net trained with GSL outperforms state-of-the-art methods.
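
The core mechanism the abstract describes is a latent code partitioned into attribute-specific parts that can be swapped between examples before decoding. Below is a minimal PyTorch sketch of that swap-and-recombine idea; the class name, latent partition, and dimensions are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class SwapAutoEncoder(nn.Module):
    """Illustrative auto-encoder with a partitioned, swappable latent code."""

    def __init__(self, in_dim=3 * 64 * 64, latent_dim=300):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, latent_dim))
        self.decoder = nn.Linear(latent_dim, in_dim)
        # Hypothetical partition of the latent vector into attribute chunks.
        self.chunks = {
            "identity": slice(0, 100),
            "color": slice(100, 200),
            "background": slice(200, 300),
        }

    def forward(self, x):
        return self.decoder(self.encoder(x))

    def swap(self, x_a, x_b, attr="color"):
        # Encode both inputs, then overwrite one attribute chunk of
        # x_a's code with the corresponding chunk from x_b's code.
        z_a, z_b = self.encoder(x_a), self.encoder(x_b)
        z_new = z_a.clone()
        z_new[:, self.chunks[attr]] = z_b[:, self.chunks[attr]]
        return self.decoder(z_new)

# Usage sketch: given a blue car and a red boat, swapping the "color"
# chunk decodes to a (never-seen) red car, flattened to a vector here.
# blue_car, red_boat = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
# red_car_flat = SwapAutoEncoder().swap(blue_car, red_boat, attr="color")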

Cite

Text

Ge et al. "Zero-Shot Synthesis with Group-Supervised Learning." International Conference on Learning Representations, 2021.

Markdown

[Ge et al. "Zero-Shot Synthesis with Group-Supervised Learning." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/ge2021iclr-zeroshot/)

BibTeX

@inproceedings{ge2021iclr-zeroshot,
  title     = {{Zero-Shot Synthesis with Group-Supervised Learning}},
  author    = {Ge, Yunhao and Abu-El-Haija, Sami and Xin, Gan and Itti, Laurent},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/ge2021iclr-zeroshot/}
}