MAT2I: Enhancing Perceptual Authenticity in Text-to-Image Synthesis Using Multi-Attribute Generative Adversarial Networks

Singh, Varsha; Singh, Vijai; Tiwary, Uma Shanker

doi:10.1613/JAIR.1.18237

JAIR 2025 pp. 2453-2469

Abstract

Text-to-image generation derives visual representations from textual descriptions and renders them as corresponding visuals. The technique has broad application in fields such as graphic design and image editing. Generative adversarial networks (GANs) are widely used for this task and are among the better-performing approaches. A primary hurdle is producing perceptually authentic visuals. This study introduces a Multi-Attribute Text-to-Image Synthesis Generative Adversarial Network (MAT2I) to address this challenge. Its enhancements comprise an attribute-control-net, feature alignment, and a perceptual loss. The attribute-control-net enables fast, attribute-specific generation that preserves perceptual authenticity while remaining adaptable. Feature alignment and perceptual loss encourage the generator to produce visuals that closely resemble real visuals matching the accompanying text, and they reduce randomness. The effectiveness of the proposed model is evaluated on the CUB and COCO datasets. Empirical findings show that this approach generates visuals with greater content diversity, enhanced realism, and improved semantic alignment with the provided text descriptions. Furthermore, the proposed method surpasses comparative techniques in terms of inception score, further establishing its competitive performance.

Cite

Text

Singh et al. "MAT2I: Enhancing Perceptual Authenticity in Text-to-Image Synthesis Using Multi-Attribute Generative Adversarial Networks." Journal of Artificial Intelligence Research, 2025. doi:10.1613/JAIR.1.18237

Markdown

[Singh et al. "MAT2I: Enhancing Perceptual Authenticity in Text-to-Image Synthesis Using Multi-Attribute Generative Adversarial Networks." Journal of Artificial Intelligence Research, 2025.](https://mlanthology.org/jair/2025/singh2025jair-mat2i/) doi:10.1613/JAIR.1.18237

BibTeX

@article{singh2025jair-mat2i,
  title     = {{MAT2I: Enhancing Perceptual Authenticity in Text-to-Image Synthesis Using Multi-Attribute Generative Adversarial Networks}},
  author    = {Singh, Varsha and Singh, Vijai and Tiwary, Uma Shanker},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2025},
  pages     = {2453--2469},
  doi       = {10.1613/JAIR.1.18237},
  volume    = {82},
  url       = {https://mlanthology.org/jair/2025/singh2025jair-mat2i/}
}