SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset

Dai, Juntao; Chen, Tianle; Wang, Xuyao; Yang, Ziran; Chen, Taiye; Ji, Jiaming; Yang, Yaodong

doi:10.52202/079017-0546

NeurIPS 2024

Abstract

To mitigate the risk of harmful outputs from large vision models (LVMs), we introduce the SafeSora dataset to promote research on aligning text-to-video generation with human values. This dataset encompasses human preferences in text-to-video generation tasks along two primary dimensions: helpfulness and harmlessness. To capture in-depth human preferences and facilitate structured reasoning by crowdworkers, we subdivide helpfulness into 4 sub-dimensions and harmlessness into 12 sub-categories, serving as the basis for pilot annotations. The SafeSora dataset includes 14,711 unique prompts, 57,333 unique videos generated by 4 distinct LVMs, and 51,691 pairs of preference annotations labeled by humans. We further demonstrate the utility of the SafeSora dataset through several applications, including training the text-video moderation model and aligning LVMs with human preference by fine-tuning a prompt augmentation module or the diffusion model. These applications highlight its potential as the foundation for text-to-video alignment research, such as human preference modeling and the development and validation of alignment algorithms. Our project is available at https://sites.google.com/view/safe-sora. Warning: this paper contains example data that may be offensive or harmful.

Cite

Text

Dai et al. "SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset." Neural Information Processing Systems, 2024. doi:10.52202/079017-0546

Markdown

[Dai et al. "SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/dai2024neurips-safesora/) doi:10.52202/079017-0546

BibTeX

@inproceedings{dai2024neurips-safesora,
  title     = {{SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset}},
  author    = {Dai, Juntao and Chen, Tianle and Wang, Xuyao and Yang, Ziran and Chen, Taiye and Ji, Jiaming and Yang, Yaodong},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-0546},
  url       = {https://mlanthology.org/neurips/2024/dai2024neurips-safesora/}
}