Bridging the Gap Between Ideal and Real-World Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Abstract

With the rapid advancement of generative models, highly realistic image synthesis has posed new challenges to digital security and media credibility. Although AI-generated image detection methods have partially addressed these concerns, a substantial research gap remains in evaluating their performance under complex real-world conditions. This paper introduces the Real-World Robustness Dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions: 1) Scenario Generalization - RRDataset encompasses high-quality images from seven major scenarios (War & Conflict, Disasters & Accidents, Political & Social Events, Medical & Public Health, Culture & Religion, Labor & Production, and Everyday Life), addressing existing dataset gaps from a content perspective. 2) Internet Transmission Robustness - examining detector performance on images that have undergone multiple rounds of sharing across various social media platforms. 3) Re-digitization Robustness - assessing model effectiveness on images altered through four distinct re-digitization methods. We benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study involving 192 participants to investigate human few-shot learning capabilities in detecting AI-generated images. The benchmarking results reveal the limitations of current AI detection methods under real-world conditions and underscore the importance of drawing on human adaptability to develop more robust detection algorithms. Our dataset is publicly available at: https://zenodo.org/records/14963880.

Cite

Text

Li et al. "Bridging the Gap Between Ideal and Real-World Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios." International Conference on Computer Vision, 2025.

Markdown

[Li et al. "Bridging the Gap Between Ideal and Real-World Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/li2025iccv-bridging/)

BibTeX

@inproceedings{li2025iccv-bridging,
  title     = {{Bridging the Gap Between Ideal and Real-World Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios}},
  author    = {Li, Chunxiao and Wang, Xiaoxiao and Li, Meiling and Miao, Boming and Sun, Peng and Zhang, Yunjian and Ji, Xiangyang and Zhu, Yao},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {20379--20389},
  url       = {https://mlanthology.org/iccv/2025/li2025iccv-bridging/}
}