DLT: Conditioned Layout Generation with Joint Discrete-Continuous Diffusion Layout Transformer

Abstract

Generating visual layouts is an essential ingredient of graphic design. The ability to condition layout generation on a partial subset of component attributes is critical to real-world applications that involve user interaction. Recently, diffusion models have demonstrated high-quality generative performance in various domains. However, it is unclear how to apply diffusion models to the natural representation of layouts, which consists of a mix of discrete (class) and continuous (location, size) attributes. To address the conditioned layout generation problem, we introduce DLT, a joint discrete-continuous diffusion model. DLT is a transformer-based model with a flexible conditioning mechanism that allows conditioning on any given subset of the layout components' classes, locations, and sizes. Our method outperforms state-of-the-art generative models on various layout generation datasets with respect to different metrics and conditioning settings. Additionally, we validate the effectiveness of our proposed conditioning mechanism and the joint discrete-continuous diffusion process. This joint process can be incorporated into a wide range of mixed discrete-continuous generative tasks.
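To make the idea of a joint discrete-continuous process with per-attribute conditioning more concrete, here is a minimal sketch of a single forward-noising step. It is not the authors' implementation. The function name `noise_layout`, the `MASK` token, the absorbing-state corruption of class labels, and the use of one shared `alpha_bar` signal level for both branches are all assumptions made for illustration.

```python
import math
import random

MASK = -1  # hypothetical absorbing "mask" token for class labels (assumption)


def noise_layout(layout, cond, alpha_bar, rng):
    """Noise a layout once, leaving conditioned attributes untouched.

    layout:    list of (class_id, [x, y, w, h]) components
    cond:      list of (class_is_given, box_is_given) flags, one pair per component
    alpha_bar: signal level in [0, 1]; 1.0 means no noise, 0.0 means pure noise
    rng:       a random.Random instance, passed in for reproducibility
    """
    noisy = []
    for (cls, box), (cls_given, box_given) in zip(layout, cond):
        # Discrete branch: mask the class with probability 1 - alpha_bar.
        if not cls_given and rng.random() >= alpha_bar:
            cls = MASK
        # Continuous branch: Gaussian noising of location and size.
        if box_given:
            box = list(box)
        else:
            scale = math.sqrt(alpha_bar)
            sigma = math.sqrt(1.0 - alpha_bar)
            box = [scale * v + sigma * rng.gauss(0.0, 1.0) for v in box]
        noisy.append((cls, box))
    return noisy


# Usage: the first component is fully given by the user, the second is free.
layout = [(2, [0.1, 0.2, 0.3, 0.4]), (5, [0.5, 0.5, 0.2, 0.2])]
cond = [(True, True), (False, False)]
noisy = noise_layout(layout, cond, alpha_bar=0.5, rng=random.Random(0))
```

In this sketch the conditioning flags act per attribute, so a user could fix only the classes, only the boxes, or any mixture across components, which mirrors the "any given subset" conditioning described in the abstract. A denoising model would then be trained to recover the unconditioned attributes from such noised layouts.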

Cite

Text

Levi et al. "DLT: Conditioned Layout Generation with Joint Discrete-Continuous Diffusion Layout Transformer." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00201

Markdown

[Levi et al. "DLT: Conditioned Layout Generation with Joint Discrete-Continuous Diffusion Layout Transformer." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/levi2023iccv-dlt/) doi:10.1109/ICCV51070.2023.00201

BibTeX

@inproceedings{levi2023iccv-dlt,
  title     = {{DLT: Conditioned Layout Generation with Joint Discrete-Continuous Diffusion Layout Transformer}},
  author    = {Levi, Elad and Brosh, Eli and Mykhailych, Mykola and Perez, Meir},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {2106--2115},
  doi       = {10.1109/ICCV51070.2023.00201},
  url       = {https://mlanthology.org/iccv/2023/levi2023iccv-dlt/}
}