Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization

Chang, Yuanyuan; Yao, Yinghua; Qin, Tao; Wang, Mengmeng; Tsang, Ivor W.; Dai, Guang

doi:10.24963/IJCAI.2025/84

IJCAI 2025 pp. 747-755

Abstract

Text-to-image diffusion models have emerged as powerful tools for high-quality image generation and editing. Many existing approaches rely on text prompts as editing guidance. However, these methods are constrained by the need for manual prompt crafting, which can be time-consuming, introduce irrelevant details, and significantly limit editing performance. In this work, we propose optimizing semantic embeddings guided by attribute classifiers to steer text-to-image models toward desired edits, without relying on text prompts or requiring any training or fine-tuning of the diffusion model. We utilize classifiers to learn precise semantic embeddings at the dataset level. The learned embeddings are theoretically justified as the optimal representation of attribute semantics, enabling disentangled and accurate edits. Experiments further demonstrate that our method achieves high levels of disentanglement and strong generalization across different domains of data. Code is available at https://github.com/Chang-yuanyuan/CASO.

Cite

Text

Chang et al. "Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/84

Markdown

[Chang et al. "Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/chang2025ijcai-instructing/) doi:10.24963/IJCAI.2025/84

BibTeX

@inproceedings{chang2025ijcai-instructing,
  title     = {{Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization}},
  author    = {Chang, Yuanyuan and Yao, Yinghua and Qin, Tao and Wang, Mengmeng and Tsang, Ivor W. and Dai, Guang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {747--755},
  doi       = {10.24963/IJCAI.2025/84},
  url       = {https://mlanthology.org/ijcai/2025/chang2025ijcai-instructing/}
}