Detecting Deep-Fake Videos from Phoneme-Viseme Mismatches
Abstract
Recent advances in machine learning and computer graphics have made it easier to convincingly manipulate video and audio. These so-called deep-fake videos range from complete full-face synthesis and replacement (face-swap), to complete mouth and audio synthesis and replacement (lip-sync), and partial word-based audio and mouth synthesis and replacement. Detection of deep fakes with only a small spatial and temporal manipulation is particularly challenging. We describe a technique to detect such manipulated videos by exploiting the fact that the dynamics of the mouth shape – visemes – are occasionally inconsistent with a spoken phoneme. We focus on the visemes associated with words having the sound M (mama), B (baba), or P (papa) in which the mouth must completely close in order to pronounce these phonemes. We observe that this is not the case in many deep-fake videos. Such phoneme-viseme mismatches can, therefore, be used to detect even spatially small and temporally localized manipulations. We demonstrate the efficacy and robustness of this approach to detect different types of deep-fake videos, including in-the-wild deep fakes.
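The core check described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two inputs that the abstract does not specify: a phoneme alignment given as `(label, start_frame, end_frame)` tuples, and a hypothetical per-frame `lip_gap` measurement (a normalized vertical lip distance, where values near 0 mean the mouth is closed). The function name `find_mbp_mismatches`, the threshold, and the frame margin are likewise illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Phonemes that require the lips to close completely (M, B, P).
CLOSED_MOUTH_PHONEMES = {"M", "B", "P"}

Phoneme = Tuple[str, int, int]  # (label, start_frame, end_frame), inclusive


def find_mbp_mismatches(
    phonemes: Sequence[Phoneme],
    lip_gap: Sequence[float],
    closed_threshold: float = 0.05,  # assumed value, not from the paper
    margin: int = 1,                 # assumed tolerance for alignment error
) -> List[Phoneme]:
    """Return M/B/P occurrences during which the mouth never closes."""
    mismatches = []
    for label, start, end in phonemes:
        if label.upper() not in CLOSED_MOUTH_PHONEMES:
            continue
        # Look slightly before and after the phoneme to allow for
        # small errors in the audio-to-frame alignment.
        lo = max(0, start - margin)
        hi = min(len(lip_gap), end + margin + 1)
        window = lip_gap[lo:hi]
        # A mismatch: the smallest lip gap in the window is still "open".
        if len(window) > 0 and min(window) > closed_threshold:
            mismatches.append((label, start, end))
    return mismatches


# Toy example: the mouth closes for "M" (frame 2) but stays open for "P".
lip_gap = [0.30, 0.20, 0.01, 0.20, 0.30, 0.30, 0.25, 0.30, 0.30]
phonemes = [("M", 2, 2), ("AA", 3, 4), ("P", 6, 7)]
flagged = find_mbp_mismatches(phonemes, lip_gap)
```

In this toy input only the `P` occurrence is flagged, because the lip gap never drops below the threshold around frames 6 to 7. A real pipeline would still need a forced aligner to produce the phoneme timings and a mouth-state estimator to produce `lip_gap`; both are outside the scope of this sketch.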
Cite
Text
Agarwal et al. "Detecting Deep-Fake Videos from Phoneme-Viseme Mismatches." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00338
Markdown
[Agarwal et al. "Detecting Deep-Fake Videos from Phoneme-Viseme Mismatches." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/agarwal2020cvprw-detecting/) doi:10.1109/CVPRW50498.2020.00338
BibTeX
@inproceedings{agarwal2020cvprw-detecting,
title = {{Detecting Deep-Fake Videos from Phoneme-Viseme Mismatches}},
author = {Agarwal, Shruti and Farid, Hany and Fried, Ohad and Agrawala, Maneesh},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2020},
pages = {2814-2822},
doi = {10.1109/CVPRW50498.2020.00338},
url = {https://mlanthology.org/cvprw/2020/agarwal2020cvprw-detecting/}
}