TextCtrl: Diffusion-Based Scene Text Editing with Prior Guidance Control

Abstract

Scene Text Editing (STE), which centres on modifying text content while preserving its style, remains a challenging task despite considerable recent progress in text-to-image synthesis and text-driven image manipulation. GAN-based STE methods generally generalize poorly, while diffusion-based STE methods suffer from undesired style deviations. To address these problems, we propose TextCtrl, a diffusion-based method that edits text with prior guidance control. Our method consists of two key components: (i) By constructing fine-grained text style disentanglement and a robust text glyph structure representation, TextCtrl explicitly incorporates Style-Structure guidance into model design and network training, significantly improving text style consistency and rendering accuracy. (ii) To further leverage the style prior, we propose a Glyph-adaptive Mutual Self-attention mechanism that deconstructs the implicit fine-grained features of the source image to enhance style consistency and visual quality during inference. Furthermore, to address the lack of a real-world STE evaluation benchmark, we create the first real-world image-pair dataset, termed ScenePair, for fair comparisons. Experiments demonstrate the effectiveness of TextCtrl compared with previous methods in both style fidelity and text accuracy. Project page: https://github.com/weichaozeng/TextCtrl.

Cite

Text

Zeng et al. "TextCtrl: Diffusion-Based Scene Text Editing with Prior Guidance Control." Neural Information Processing Systems, 2024. doi:10.52202/079017-4396

Markdown

[Zeng et al. "TextCtrl: Diffusion-Based Scene Text Editing with Prior Guidance Control." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/zeng2024neurips-textctrl/) doi:10.52202/079017-4396

BibTeX

@inproceedings{zeng2024neurips-textctrl,
  title     = {{TextCtrl: Diffusion-Based Scene Text Editing with Prior Guidance Control}},
  author    = {Zeng, Weichao and Shu, Yan and Li, Zhenhang and Yang, Dongbao and Zhou, Yu},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-4396},
  url       = {https://mlanthology.org/neurips/2024/zeng2024neurips-textctrl/}
}