Generated Audio Detectors Are Not Robust in Real-World Conditions
Abstract
The misuse of generative AI (genAI) has raised significant ethical and trust issues. To mitigate this, substantial focus has been placed on detecting generated media, including fake audio. In this paper, we examine the efficacy of state-of-the-art fake audio detection methods under real-world conditions. By analyzing typical audio alterations of transmission pipelines, we identify several vulnerabilities: (1) minimal changes such as sound level variations can bias detection performance, (2) inevitable physical effects such as background noise lead to classifier failures, (3) classifiers struggle to generalize across different datasets, and (4) network degradation affects the overall detection performance. Our results indicate that existing detectors have major issues in differentiating between real and fake audio in practical applications and that significant improvements are still necessary for reliable detection in real-world environments.
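The exact perturbations used in the paper are not reproduced here; as a minimal sketch, two of the alterations described (sound level variation and additive background noise at a target SNR) could be simulated as follows. The helper names `apply_gain` and `add_noise` are hypothetical, not from the paper.

```python
import numpy as np

def apply_gain(audio, gain_db):
    """Scale the waveform by a gain in decibels (sound level variation)."""
    return audio * 10 ** (gain_db / 20)

def add_noise(audio, snr_db, seed=None):
    """Mix in white background noise at a target signal-to-noise ratio (dB)."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise

# Example: perturb a 1-second 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
perturbed = add_noise(apply_gain(clean, -6.0), snr_db=10.0, seed=0)
```

Feeding both `clean` and `perturbed` to a detector and comparing scores would expose the kind of sensitivity the paper reports.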
Cite
Text
Shaw et al. "Generated Audio Detectors Are Not Robust in Real-World Conditions." ICML 2024 Workshops: NextGenAISafety, 2024.
Markdown
[Shaw et al. "Generated Audio Detectors Are Not Robust in Real-World Conditions." ICML 2024 Workshops: NextGenAISafety, 2024.](https://mlanthology.org/icmlw/2024/shaw2024icmlw-generated/)
BibTeX
@inproceedings{shaw2024icmlw-generated,
title = {{Generated Audio Detectors Are Not Robust in Real-World Conditions}},
author = {Shaw, Soumya and Nassi, Ben and Schönherr, Lea},
booktitle = {ICML 2024 Workshops: NextGenAISafety},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/shaw2024icmlw-generated/}
}