Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection

Abstract

Recent advances in text-to-image diffusion models have spurred significant interest in continuous story image generation. In this paper, we introduce Storynizor, a model capable of generating coherent stories with strong inter-frame character consistency, effective foreground-background separation, and diverse pose variation. The core innovation of Storynizor lies in its key modules: ID-Synchronizer and ID-Injector. The ID-Synchronizer employs an auto-mask self-attention module and a mask perceptual loss across inter-frame images to improve the consistency of character generation, vividly representing their postures and backgrounds. The ID-Injector utilizes a Shuffling Reference Strategy (SRS) to integrate ID features into specific locations, enhancing ID-based consistent character generation. Additionally, to facilitate the training of Storynizor, we have curated a novel dataset called StoryDB comprising 100,000 images. This dataset contains single- and multiple-character sets in diverse environments, layouts, and gestures with detailed descriptions. Experimental results indicate that Storynizor demonstrates superior coherent story generation with high-fidelity character consistency, flexible postures, and vivid backgrounds compared to other character-specific methods.

Cite

Text

Ma et al. "Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I6.32644

Markdown

[Ma et al. "Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/ma2025aaai-storynizor/) doi:10.1609/AAAI.V39I6.32644

BibTeX

@inproceedings{ma2025aaai-storynizor,
  title     = {{Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection}},
  author    = {Ma, Yuhang and Xu, Wenting and Zhao, Chaoyi and Sun, Keqiang and Jin, Qinfeng and Yang, Xiaoda and Zhao, Zeng and Fan, Changjie and Hu, Zhipeng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {6027--6035},
  doi       = {10.1609/AAAI.V39I6.32644},
  url       = {https://mlanthology.org/aaai/2025/ma2025aaai-storynizor/}
}