Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Abstract
While large diffusion models are commonly trained by collecting datasets for target downstream tasks, it is often desirable to align and finetune pretrained diffusion models with reward functions that are either designed by experts or learned from small-scale datasets. Existing post-training methods for reward finetuning of diffusion models typically suffer from a lack of diversity in generated samples, a lack of prior preservation, and/or slow convergence during finetuning. In response to this challenge, we take inspiration from recent successes in generative flow networks (GFlowNets) and propose a reinforcement learning method for diffusion model finetuning, dubbed Nabla-GFlowNet (abbreviated as $\nabla$-GFlowNet), that leverages the rich signal in reward gradients for probabilistic diffusion finetuning. We show that our proposed method achieves fast yet diversity- and prior-preserving finetuning of Stable Diffusion, a large-scale text-conditioned image diffusion model, on different realistic reward functions.
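The abstract only names the method, so the sketch below is not the paper's $\nabla$-GFlowNet objective. It is a minimal PyTorch illustration of the general idea of letting reward gradients reach the denoiser by backpropagating through the sampling chain, with an L2 penalty toward a frozen copy of the pretrained model standing in for prior preservation; the toy denoiser, the quadratic reward, and all hyperparameters are assumptions made for this example.

```python
# Illustrative sketch only (not the paper's actual objective): finetune a toy
# denoiser so that reward gradients flow through the sampling chain, while an
# L2 penalty toward the frozen pretrained denoiser stands in for prior preservation.
import copy
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Predicts the noise to remove at step t (stand-in for a diffusion U-Net)."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, hidden), nn.SiLU(),
                                 nn.Linear(hidden, dim))

    def forward(self, x, t):
        t_feat = torch.full((x.shape[0], 1), float(t), device=x.device)
        return self.net(torch.cat([x, t_feat], dim=-1))

def reward(x):
    # Hypothetical differentiable reward: prefer samples near the point (1, 1).
    return -((x - 1.0) ** 2).sum(dim=-1)

pretrained = ToyDenoiser()
finetuned = copy.deepcopy(pretrained)                     # start from "pretrained" weights
frozen_prior = copy.deepcopy(pretrained).requires_grad_(False)
opt = torch.optim.Adam(finetuned.parameters(), lr=1e-3)

num_steps, batch = 8, 32
for it in range(200):
    x = torch.randn(batch, 2)                             # start from Gaussian noise
    prior_penalty = 0.0
    for t in reversed(range(num_steps)):
        eps = finetuned(x, t)
        # Keep the finetuned policy close to the frozen prior at every step.
        prior_penalty = prior_penalty + ((eps - frozen_prior(x, t)) ** 2).mean()
        x = x - 0.1 * eps                                  # toy deterministic denoising update
    # Reward gradients reach the denoiser by backpropagating through the chain.
    loss = -reward(x).mean() + 0.1 * prior_penalty
    opt.zero_grad(); loss.backward(); opt.step()
```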
Cite
Text
Liu et al. "Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets." International Conference on Learning Representations, 2025.Markdown
[Liu et al. "Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/liu2025iclr-efficient/)BibTeX
@inproceedings{liu2025iclr-efficient,
title = {{Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets}},
author = {Liu, Zhen and Xiao, Tim Z. and Liu, Weiyang and Bengio, Yoshua and Zhang, Dinghuai},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/liu2025iclr-efficient/}
}