Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Abstract

Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs). While cutting-edge diffusion models such as Stable Diffusion (SD) and SDXL rely on supervised fine-tuning, their performance inevitably plateaus after seeing a certain volume of data. Recently, reinforcement learning (RL) has been employed to fine-tune diffusion models with human preference data, but it requires at least two images ("winner" and "loser" images) for each text prompt. In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion), where the diffusion model engages in competition with its earlier versions, facilitating an iterative self-improvement process. Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment. Our experiments on the Pick-a-Pic dataset reveal that SPIN-Diffusion outperforms the existing supervised fine-tuning method in aspects of human preference alignment and visual appeal right from its first iteration. By the second iteration, it exceeds the performance of RLHF-based methods across all metrics, achieving these results with less data. Codes are available at https://github.com/uclaml/SPIN-Diffusion/.

Cite

Text

Yuan et al. "Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation." Neural Information Processing Systems, 2024. doi:10.52202/079017-2334

Markdown

[Yuan et al. "Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/yuan2024neurips-selfplay/) doi:10.52202/079017-2334

BibTeX

@inproceedings{yuan2024neurips-selfplay,
  title     = {{Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation}},
  author    = {Yuan, Huizhuo and Chen, Zixiang and Ji, Kaixuan and Gu, Quanquan},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-2334},
  url       = {https://mlanthology.org/neurips/2024/yuan2024neurips-selfplay/}
}