Story Visualization by Online Text Augmentation with Context Memory

Abstract

Story visualization (SV) is a challenging text-to-image generation task because it requires not only rendering visual details from the text descriptions but also encoding long-term context across multiple sentences. While prior efforts mostly focus on generating a semantically relevant image for each sentence, encoding the context spread across the given paragraph to generate contextually convincing images (e.g., with the correct character or a proper background for the scene) remains a challenge. To this end, we propose a novel memory architecture for the Bi-directional Transformer framework, together with an online text augmentation that generates multiple pseudo-descriptions as supplementary supervision during training, for better generalization to language variation at inference. In extensive experiments on two popular SV benchmarks, Pororo-SV and Flintstones-SV, the proposed method significantly outperforms the state of the art in various metrics, including FID, character F1, frame accuracy, BLEU-2/3, and R-precision, with similar or lower computational complexity.
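Two of the metrics named in the abstract, frame accuracy and character F1, can be illustrated with a short sketch. It assumes each generated frame has already been reduced to a binary vector of character presence (for example, by a separate character classifier that is not shown here), that frame accuracy means an exact match over all characters in a frame, and that F1 is micro-averaged over all frame/character pairs. The function names and the toy data are hypothetical, and the paper's actual evaluation protocol may differ.

```python
from typing import Sequence


def frame_accuracy(pred: Sequence[Sequence[int]], true: Sequence[Sequence[int]]) -> float:
    """Fraction of frames whose predicted character vector matches the label exactly."""
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and of equal length")
    exact = sum(1 for p, t in zip(pred, true) if list(p) == list(t))
    return exact / len(true)


def character_f1(pred: Sequence[Sequence[int]], true: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over every (frame, character) pair (an assumed averaging mode)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, true):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy example: 3 frames, 3 possible characters (hypothetical data).
    pred = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    true = [[1, 0, 1], [0, 1, 1], [1, 0, 0]]
    print(frame_accuracy(pred, true))
    print(character_f1(pred, true))
```

Frame accuracy is the stricter of the two: a frame with one missing or extra character counts as fully wrong, whereas character F1 still credits the characters that were rendered correctly.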

Cite

Text

Ahn et al. "Story Visualization by Online Text Augmentation with Context Memory." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00290

Markdown

[Ahn et al. "Story Visualization by Online Text Augmentation with Context Memory." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/ahn2023iccv-story/) doi:10.1109/ICCV51070.2023.00290

BibTeX

@inproceedings{ahn2023iccv-story,
  title     = {{Story Visualization by Online Text Augmentation with Context Memory}},
  author    = {Ahn, Daechul and Kim, Daneul and Song, Gwangmo and Kim, Seung Hwan and Lee, Honglak and Kang, Dongyeop and Choi, Jonghyun},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {3125--3135},
  doi       = {10.1109/ICCV51070.2023.00290},
  url       = {https://mlanthology.org/iccv/2023/ahn2023iccv-story/}
}