Responsible Visual Editing

Abstract

With recent advances in visual synthesis, there is a growing risk of encountering synthesized images with detrimental effects, such as hate, discrimination, and privacy violations. Unfortunately, how to avoid synthesizing harmful images and how to convert them into responsible ones remains unexplored. In this paper, we present responsible visual editing, which edits risky concepts within an image into more responsible ones with minimal content changes. However, the concepts that need to be edited are often abstract, making them hard to locate and edit. To tackle these challenges, we propose a Cognitive Editor (CoEditor) that harnesses large multimodal models through a two-stage cognitive process: (1) a perceptual cognitive process to locate what should be edited and (2) a behavioral cognitive process to strategize how to edit it. To mitigate the negative implications of harmful images on research, we build a transparent and public dataset, AltBear, which expresses harmful information using teddy bears instead of humans. Experiments demonstrate that CoEditor can effectively comprehend abstract concepts in complex scenes, significantly surpassing the baseline models for responsible visual editing. Moreover, we find that the AltBear dataset corresponds well to the harmful content found in real images, providing a safe and effective benchmark for future research. Our source code and dataset can be found at https://github.com/kodenii/Responsible-Visual-Editing.

Cite

Text

Ni et al. "Responsible Visual Editing." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72670-5_18

Markdown

[Ni et al. "Responsible Visual Editing." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/ni2024eccv-responsible/) doi:10.1007/978-3-031-72670-5_18

BibTeX

@inproceedings{ni2024eccv-responsible,
  title     = {{Responsible Visual Editing}},
  author    = {Ni, Minheng and Shen, Yeli and Zhang, Lei and Zuo, Wangmeng},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72670-5_18},
  url       = {https://mlanthology.org/eccv/2024/ni2024eccv-responsible/}
}