Seeing What Matters: Generalizable AI-Generated Video Detection with Forensic-Oriented Augmentation

Corvi, Riccardo; Cozzolino, Davide; Prashnani, Ekta; De Mello, Shalini; Nagano, Koki; Verdoliva, Luisa

NeurIPS 2025

Abstract

Synthetic video generation is progressing rapidly. The latest models can produce realistic high-resolution videos that are virtually indistinguishable from real ones. Although several video forensic detectors have recently been proposed, they often generalize poorly, which limits their applicability in real-world scenarios. Our key insight to overcome this issue is to guide the detector towards _seeing what really matters_. A well-designed forensic classifier should focus on identifying intrinsic low-level artifacts introduced by a generative architecture rather than relying on high-level semantic flaws that characterize a specific model. In this work, we first study different generative architectures, searching for and identifying discriminative features that are unbiased, robust to impairments, and shared across models. Then, we introduce a novel forensic-oriented data augmentation strategy that is based on wavelet decomposition and replaces specific frequency-related bands to drive the model to exploit more relevant forensic cues. Our training paradigm improves the generalizability of AI-generated video detectors without the need for complex algorithms or large datasets that include multiple synthetic generators. To evaluate our approach, we train the detector using data from a single generative model and test it against videos produced by a wide range of other models. Despite its simplicity, our method achieves a significant accuracy improvement over state-of-the-art detectors and obtains excellent results even on very recent generative models, such as NOVA and FLUX.
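
The abstract does not say which wavelet, which bands, or what replacement content the augmentation uses, so the following is only a minimal sketch of the general idea of swapping wavelet sub-bands between two frames. It assumes a single-level Haar transform on a grayscale frame with even height and width. The function names (`haar_dwt2`, `haar_idwt2`, `replace_bands`), the band labels, and the use of a second "donor" frame are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(frame):
    """Single-level 2D Haar decomposition (illustrative, not the paper's code).

    Band labels are an assumption: the first letter refers to the filter
    applied along rows (axis 0), the second to the filter along columns.
    """
    x = np.asarray(frame, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    low = (x[0::2, :] + x[1::2, :]) / SQRT2    # low-pass along axis 0
    high = (x[0::2, :] - x[1::2, :]) / SQRT2   # high-pass along axis 0
    return {
        "LL": (low[:, 0::2] + low[:, 1::2]) / SQRT2,
        "LH": (low[:, 0::2] - low[:, 1::2]) / SQRT2,
        "HL": (high[:, 0::2] + high[:, 1::2]) / SQRT2,
        "HH": (high[:, 0::2] - high[:, 1::2]) / SQRT2,
    }

def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the frame from its four sub-bands."""
    ll, lh, hl, hh = bands["LL"], bands["LH"], bands["HL"], bands["HH"]
    h, w = ll.shape
    low = np.empty((h, 2 * w))
    high = np.empty((h, 2 * w))
    low[:, 0::2], low[:, 1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    high[:, 0::2], high[:, 1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    out = np.empty((2 * h, 2 * w))
    out[0::2, :] = (low + high) / SQRT2
    out[1::2, :] = (low - high) / SQRT2
    return out

def replace_bands(target, donor, band_names):
    """Return `target` with the named sub-bands taken from `donor`.

    Which bands to swap, and where the donor comes from, are assumptions
    left to the caller; the abstract does not specify them.
    """
    target_bands = haar_dwt2(target)
    donor_bands = haar_dwt2(donor)
    if target_bands["LL"].shape != donor_bands["LL"].shape:
        raise ValueError("target and donor must have the same shape")
    for name in band_names:
        target_bands[name] = donor_bands[name]
    return haar_idwt2(target_bands)
```

For example, `replace_bands(frame_a, frame_b, ["LL"])` would keep the detail bands of `frame_a` while taking the coarse approximation from `frame_b`. In practice a library such as PyWavelets would likely replace the hand-written transform.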

Cite

Text

Corvi et al. "Seeing What Matters: Generalizable AI-Generated Video Detection with Forensic-Oriented Augmentation." Advances in Neural Information Processing Systems, 2025.

Markdown

[Corvi et al. "Seeing What Matters: Generalizable AI-Generated Video Detection with Forensic-Oriented Augmentation." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/corvi2025neurips-seeing/)

BibTeX

@inproceedings{corvi2025neurips-seeing,
  title     = {{Seeing What Matters: Generalizable AI-Generated Video Detection with Forensic-Oriented Augmentation}},
  author    = {Corvi, Riccardo and Cozzolino, Davide and Prashnani, Ekta and De Mello, Shalini and Nagano, Koki and Verdoliva, Luisa},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/corvi2025neurips-seeing/}
}