HDLayout: Hierarchical and Directional Layout Planning for Arbitrary Shaped Visual Text Generation

Abstract

Visual text generation, which aims to generate photo-realistic images containing coherent, well-formed scene text, has attracted widespread attention. Although recent works have achieved promising performance, their limited flexibility and controllability hinder practical application. We observe that, unlike natural objects, visual text in real scenes often has an arbitrarily shaped structure at different granularities (i.e., character, word, or line). In this paper, we consider the modality gap between image and text, and propose a new separation and composition pipeline for flexible and controllable visual text generation from text prompts alone. At the core of our framework is a novel Hierarchical and Directional Layout representation, i.e., HDLayout, which can model the sequential and multi-granularity nature of visual text. Under this formulation, we are able to generate arbitrarily shaped visual text automatically. Extensive experiments demonstrate that our method outperforms several strong baselines in a variety of scenarios both qualitatively and quantitatively, yielding state-of-the-art performance on arbitrarily shaped visual text generation.

Cite

Text

Feng et al. "HDLayout: Hierarchical and Directional Layout Planning for Arbitrary Shaped Visual Text Generation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I3.32307

Markdown

[Feng et al. "HDLayout: Hierarchical and Directional Layout Planning for Arbitrary Shaped Visual Text Generation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/feng2025aaai-hdlayout/) doi:10.1609/AAAI.V39I3.32307

BibTeX

@inproceedings{feng2025aaai-hdlayout,
  title     = {{HDLayout: Hierarchical and Directional Layout Planning for Arbitrary Shaped Visual Text Generation}},
  author    = {Feng, Tonghui and Yan, Chunsheng and Wang, Qianru and Cui, Jiangtao and Qiao, Xiaotian},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {2996--3003},
  doi       = {10.1609/AAAI.V39I3.32307},
  url       = {https://mlanthology.org/aaai/2025/feng2025aaai-hdlayout/}
}