Semantically Consistent Hierarchical Text to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network
Abstract
In this paper, we present the enhanced Attentional Generative Adversarial Network (e-AttnGAN) with improved training stability for text-to-image synthesis. e-AttnGAN's integrated attention module utilizes both sentence and word context features and performs feature-wise linear modulation (FiLM) to fuse visual and natural language representations. In addition to the multimodal similarity learning for text and image features used in AttnGAN, cosine and feature matching losses between real and generated images are included, together with a classification loss for "significant attributes". To improve training stability and address mode collapse, spectral normalization and a two time-scale update rule for the discriminator are used together with instance noise. Our experiments show that e-AttnGAN outperforms state-of-the-art methods on the FashionGen and DeepFashion-Synthesis datasets.
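As a rough illustration of the FiLM step mentioned in the abstract, the sketch below scales and shifts each channel of an image feature map using values predicted from a text vector. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `film`, the projection matrices `w_gamma` and `w_beta`, the tensor shapes, and the use of a single text vector (rather than the paper's combined sentence and word context features) are all hypothetical.

```python
import numpy as np


def film(features, text_vec, w_gamma, w_beta):
    """Apply feature-wise linear modulation to a (C, H, W) feature map.

    gamma and beta are per-channel scale and shift values predicted from
    the text vector by two linear projections (hypothetical weights).
    """
    gamma = w_gamma @ text_vec  # per-channel scale, shape (C,)
    beta = w_beta @ text_vec    # per-channel shift, shape (C,)
    # Broadcast the per-channel scale and shift over the spatial dimensions.
    return gamma[:, None, None] * features + beta[:, None, None]


# Toy inputs: random values stand in for learned weights and real features.
rng = np.random.default_rng(0)
C, H, W, D = 4, 8, 8, 16  # channels, height, width, text embedding size
features = rng.standard_normal((C, H, W))
text_vec = rng.standard_normal(D)
w_gamma = rng.standard_normal((C, D))
w_beta = rng.standard_normal((C, D))

out = film(features, text_vec, w_gamma, w_beta)
print(out.shape)
```

In a trained model the two projections would be learned jointly with the generator, so the text conditions which visual channels are amplified or suppressed.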
Cite
Text
Ak et al. "Semantically Consistent Hierarchical Text to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00379
Markdown
[Ak et al. "Semantically Consistent Hierarchical Text to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/ak2019iccvw-semantically/) doi:10.1109/ICCVW.2019.00379
BibTeX
@inproceedings{ak2019iccvw-semantically,
title = {{Semantically Consistent Hierarchical Text to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network}},
author = {Ak, Kenan Emir and Lim, Joo Hwee and Tham, Jo Yew and Kassim, Ashraf A.},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2019},
pages = {3121--3124},
doi = {10.1109/ICCVW.2019.00379},
url = {https://mlanthology.org/iccvw/2019/ak2019iccvw-semantically/}
}