UltraEdit: Instruction-Based Fine-Grained Image Editing at Scale
Abstract
This paper presents UltraEdit, a large-scale (~4M editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks of existing image editing datasets such as InstructPix2Pix and MagicBrush, and to provide a systematic approach to producing massive, high-quality image editing samples: 1) UltraEdit includes more diverse editing instructions by combining LLM creativity with in-context editing examples from human raters; 2) UltraEdit is anchored on real images (photographs or artworks), which offer greater diversity and fewer biases than images purely synthesized by text-to-image models; 3) UltraEdit supports region-based editing with high-quality, automatically produced region annotations. Our experiments show that canonical diffusion-based editing baselines trained on UltraEdit set new records on the challenging MagicBrush and Emu-Edit benchmarks. Our analysis further confirms the crucial role of real image anchors and region-based editing data. The dataset, code, and models will be made public.
Cite
Text
Zhao et al. "UltraEdit: Instruction-Based Fine-Grained Image Editing at Scale." Neural Information Processing Systems, 2024. doi:10.52202/079017-0100
Markdown
[Zhao et al. "UltraEdit: Instruction-Based Fine-Grained Image Editing at Scale." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/zhao2024neurips-ultraedit/) doi:10.52202/079017-0100
BibTeX
@inproceedings{zhao2024neurips-ultraedit,
title = {{UltraEdit: Instruction-Based Fine-Grained Image Editing at Scale}},
author = {Zhao, Haozhe and Ma, Xiaojian and Chen, Liang and Si, Shuzheng and Wu, Rujie and An, Kaikai and Yu, Peiyu and Zhang, Minjia and Li, Qing and Chang, Baobao},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-0100},
url = {https://mlanthology.org/neurips/2024/zhao2024neurips-ultraedit/}
}