Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
Abstract
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs). While cutting-edge diffusion models such as Stable Diffusion (SD) and SDXL rely on supervised fine-tuning, their performance inevitably plateaus after seeing a certain volume of data. Recently, reinforcement learning (RL) has been employed to fine-tune diffusion models with human preference data, but it requires at least two images (winner'' andloser'' images) for each text prompt.In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion), where the diffusion model engages in competition with its earlier versions, facilitating an iterative self-improvement process. Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment. Our experiments on the Pick-a-Pic dataset reveal that SPIN-Diffusion outperforms the existing supervised fine-tuning method in aspects of human preference alignment and visual appeal right from its first iteration. By the second iteration, it exceeds the performance of RLHF-based methods across all metrics, achieving these results with less data. Codes are available at \url{https://github.com/uclaml/SPIN-Diffusion/}.
Cite
Text
Yuan et al. "Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation." Neural Information Processing Systems, 2024. doi:10.52202/079017-2334Markdown
[Yuan et al. "Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/yuan2024neurips-selfplay/) doi:10.52202/079017-2334BibTeX
@inproceedings{yuan2024neurips-selfplay,
title = {{Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation}},
author = {Yuan, Huizhuo and Chen, Zixiang and Ji, Kaixuan and Gu, Quanquan},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-2334},
url = {https://mlanthology.org/neurips/2024/yuan2024neurips-selfplay/}
}