Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization

Abstract

In this work, we focus on the alignment problem of diffusion models with a continuous reward function, which represents specific objectives for downstream tasks, such as improving human preference. The central goal of the alignment problem is to adjust the distribution learned by diffusion models such that the generated samples maximize the target reward function. We propose a novel alignment approach, named Direct Noise Optimization (DNO), that optimizes the injected noise during the sampling process of diffusion models. By design, DNO is tuning-free and prompt-agnostic, as the alignment occurs in an online fashion during generation. We rigorously study the theoretical properties of DNO and also empirically identify that a naive implementation of DNO occasionally suffers from the out-of-distribution reward hacking problem, where optimized samples have high rewards but are no longer in the support of the pretrained distribution. To remedy this issue, we leverage classical high-dimensional statistics theory and propose to augment the DNO loss with certain probability regularization. We conduct a novel experiment to verify that DNO is an effective tuning-free approach for aligning diffusion models, and that our proposed regularization can indeed prevent the out-of-distribution reward hacking problem.
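
The idea of optimizing the injected noise against a reward, and of regularizing that noise so it stays plausible, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_sampler` is a hypothetical linear stand-in for a diffusion sampler, `toy_reward` is a hypothetical unbounded reward, and `shell_penalty` is a simple norm-based regularizer built on the fact that d-dimensional standard Gaussian noise has squared norm close to d. The regularizer actually proposed by the authors may well differ.

```python
def toy_sampler(z):
    """Hypothetical stand-in for a deterministic sampler: maps noise to a sample."""
    return [0.5 * zi for zi in z]


def toy_reward(x):
    """Hypothetical continuous reward; it is unbounded, so it can be hacked."""
    return sum(x)


def shell_penalty(z):
    """Squared deviation of ||z||^2 / d from 1, its typical value for N(0, I) noise."""
    d = len(z)
    return (sum(zi * zi for zi in z) / d - 1.0) ** 2


def optimize_noise(z0, steps=500, lr=0.02, lam=0.0):
    """Gradient ascent on toy_reward(toy_sampler(z)) - lam * shell_penalty(z)."""
    z = list(z0)
    d = len(z)
    for _ in range(steps):
        s = sum(zi * zi for zi in z) / d - 1.0
        # d(reward)/dz_i = 0.5 for the toy sampler and reward above;
        # d(penalty)/dz_i = 4 * s * z_i / d.
        z = [zi + lr * (0.5 - lam * 4.0 * s * zi / d) for zi in z]
    return z


# Fixed starting noise with ||z||^2 / d = 1, used instead of a random draw
# so that the run is reproducible.
z0 = [1.0, -1.0] * 8

z_plain = optimize_noise(z0, lam=0.0)   # reward only
z_reg = optimize_noise(z0, lam=40.0)    # reward plus norm regularizer

for name, z in [("initial", z0), ("reward only", z_plain), ("regularized", z_reg)]:
    print(name, toy_reward(toy_sampler(z)), shell_penalty(z))
```

In this toy setting the reward-only run reaches the higher reward, but its noise vector drifts far from the region where Gaussian noise typically lies, which mirrors the reward hacking behavior described above. The regularized run still improves the reward while keeping the noise norm close to its typical value. No model weights are updated in either case; only the noise changes.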

Cite

Text

Tang et al. "Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization." ICML 2024 Workshops: SPIGM, 2024.

Markdown

[Tang et al. "Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization." ICML 2024 Workshops: SPIGM, 2024.](https://mlanthology.org/icmlw/2024/tang2024icmlw-tuningfree/)

BibTeX

@inproceedings{tang2024icmlw-tuningfree,
  title     = {{Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization}},
  author    = {Tang, Zhiwei and Peng, Jiangweizhi and Tang, Jiasheng and Hong, Mingyi and Wang, Fan and Chang, Tsung-Hui},
  booktitle = {ICML 2024 Workshops: SPIGM},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/tang2024icmlw-tuningfree/}
}