Detecting AI-Synthesized Speech Using Bispectral Analysis
Abstract
From speech to images and videos, advances in machine learning have led to dramatic improvements in the quality and realism of so-called AI-synthesized content. While there are many exciting and interesting applications, this type of content can also be used to create convincing and dangerous fakes. We seek to develop forensic techniques that can distinguish a real human voice from a synthesized voice. We observe that deep neural networks used to synthesize speech introduce specific and unusual spectral correlations not typically found in human speech. Although not necessarily audible, these correlations can be measured using tools from bispectral analysis and used to distinguish human from synthesized speech.
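To make the idea of measuring higher-order spectral correlations concrete, the following is a minimal sketch of a segment-averaged bicoherence estimate (the magnitude of the normalized bispectrum). It is an illustration under stated assumptions, not the authors' pipeline: the function name `bicoherence`, the parameters `seg_len` and `hop`, the Hann window, and the particular normalization are all choices made here for the example, and the abstract does not specify any of them.

```python
import numpy as np

def bicoherence(signal, seg_len=64, hop=32):
    """Estimate bicoherence magnitude on a (seg_len//2 x seg_len//2) grid.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(seg_len)

    # Split the signal into overlapping, windowed segments.
    starts = range(0, len(signal) - seg_len + 1, hop)
    segs = np.stack([signal[s:s + seg_len] * window for s in starts])

    # Per-segment Fourier transform, shape (n_segs, seg_len).
    X = np.fft.fft(segs, axis=1)

    # Frequency-bin pairs (k1, k2); k1 + k2 stays below seg_len,
    # so no index wrapping is needed.
    k = np.arange(seg_len // 2)
    k1, k2 = np.meshgrid(k, k, indexing="ij")

    prod = X[:, k1] * X[:, k2]   # X(k1) * X(k2) for each segment
    X12 = X[:, k1 + k2]          # X(k1 + k2) for each segment

    # Averaged triple product, normalized so values lie in [0, 1]
    # (this bound follows from the Cauchy-Schwarz inequality).
    num = np.abs(np.sum(prod * np.conj(X12), axis=0))
    den = np.sqrt(np.sum(np.abs(prod) ** 2, axis=0)
                  * np.sum(np.abs(X12) ** 2, axis=0))
    return num / (den + 1e-12)
```

With this normalization, a signal whose components at bins `k1`, `k2`, and `k1 + k2` are phase-coupled gives a value near 1 at `[k1, k2]`, while independent noise gives values that shrink as more segments are averaged. A detector along the lines the abstract describes would presumably compute statistics of such a map and feed them to a classifier, but those details are not given here.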
Cite
Text
AlBadawy et al. "Detecting AI-Synthesized Speech Using Bispectral Analysis." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.
Markdown
[AlBadawy et al. "Detecting AI-Synthesized Speech Using Bispectral Analysis." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/albadawy2019cvprw-detecting/)
BibTeX
@inproceedings{albadawy2019cvprw-detecting,
title = {{Detecting AI-Synthesized Speech Using Bispectral Analysis}},
author = {AlBadawy, Ehab A. and Lyu, Siwei and Farid, Hany},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2019},
pages = {104-109},
url = {https://mlanthology.org/cvprw/2019/albadawy2019cvprw-detecting/}
}