Improved Adversarial Image Captioning
Abstract
In this paper we study image captioning as a conditional GAN training problem, proposing both a context-aware LSTM captioner and a co-attentive discriminator that enforces semantic alignment between images and captions. We investigate the viability of two discrete GAN training methods, Self-critical Sequence Training (SCST) and Gumbel Straight-Through (ST), and demonstrate that SCST shows more stable gradient behavior and improved results over Gumbel ST.
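As a rough illustration of the SCST idea named in the abstract, the sketch below computes a self-critical surrogate loss for a single caption: the reward of a sampled caption is compared against the reward of the greedy-decoded caption, and the difference scales the sampled caption's log-likelihood. The function name `scst_loss`, its parameters, and the use of a discriminator score as the reward are assumptions made for illustration, not the authors' implementation.

```python
import math


def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical surrogate loss for one caption (hypothetical helper).

    sample_logprobs: per-token log-probabilities of the sampled caption
    sample_reward:   scalar reward of the sampled caption
                     (assumed here to be a discriminator score)
    greedy_reward:   scalar reward of the greedy-decoded caption (the baseline)
    """
    # The advantage is treated as a constant with respect to the captioner.
    advantage = sample_reward - greedy_reward
    # Minimizing this value raises the likelihood of sampled captions that
    # beat the greedy baseline and lowers it for those that do not.
    return -advantage * sum(sample_logprobs)


# Toy example: a two-token caption with token probabilities 0.5 and 0.25.
logprobs = [math.log(0.5), math.log(0.25)]
loss = scst_loss(logprobs, sample_reward=0.8, greedy_reward=0.6)
```

Because the reward enters only as a scalar weight, no gradient has to pass through the discrete sampling step. Gumbel ST instead backpropagates through a continuous relaxation of the samples.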
Cite
Text
Dognin et al. "Improved Adversarial Image Captioning." ICLR 2019 Workshops: DeepGenStruct, 2019.
Markdown
[Dognin et al. "Improved Adversarial Image Captioning." ICLR 2019 Workshops: DeepGenStruct, 2019.](https://mlanthology.org/iclrw/2019/dognin2019iclrw-improved/)
BibTeX
@inproceedings{dognin2019iclrw-improved,
title = {{Improved Adversarial Image Captioning}},
author = {Dognin, Pierre and Melnyk, Igor and Mroueh, Youssef and Ross, Jarret and Sercu, Tom},
booktitle = {ICLR 2019 Workshops: DeepGenStruct},
year = {2019},
url = {https://mlanthology.org/iclrw/2019/dognin2019iclrw-improved/}
}