3D Noise and Adversarial Supervision Is All You Need for Multi-Modal Semantic Image Synthesis

Sushko, Vadim; Schönfeld, Edgar; Zhang, Dan; Gall, Jürgen; Schiele, Bernt; Khoreva, Anna

doi:10.1007/978-3-030-65414-6_39

ECCVW 2020 pp. 554-558

Abstract

Semantic image synthesis models suffer from training instabilities and poor image quality when trained with adversarial supervision alone. Historically, this was alleviated via an additional VGG-based perceptual loss. Hence, we propose a new simplified GAN model, which needs only adversarial supervision to achieve high-quality results. In doing so, we also show that the VGG supervision decreases image diversity and can hurt image quality. We achieve the improvement by re-designing the discriminator as a semantic segmentation network. The resulting stronger supervision makes the VGG loss obsolete. Moreover, in contrast to previous work, we enable high-quality multi-modal image synthesis through a novel noise sampling scheme. Compared to the state of the art, we achieve an average improvement of 6 FID and 7 mIoU.

Cite

Text

Sushko et al. "3D Noise and Adversarial Supervision Is All You Need for Multi-Modal Semantic Image Synthesis." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-65414-6_39

Markdown

[Sushko et al. "3D Noise and Adversarial Supervision Is All You Need for Multi-Modal Semantic Image Synthesis." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/sushko2020eccvw-3d/) doi:10.1007/978-3-030-65414-6_39

BibTeX

@inproceedings{sushko2020eccvw-3d,
  title     = {{3D Noise and Adversarial Supervision Is All You Need for Multi-Modal Semantic Image Synthesis}},
  author    = {Sushko, Vadim and Schönfeld, Edgar and Zhang, Dan and Gall, Jürgen and Schiele, Bernt and Khoreva, Anna},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2020},
  pages     = {554--558},
  doi       = {10.1007/978-3-030-65414-6_39},
  url       = {https://mlanthology.org/eccvw/2020/sushko2020eccvw-3d/}
}