VADER: Video Alignment Differencing and Retrieval

Abstract

We propose VADER, a spatio-temporal matching, alignment, and change summarization method to help fight misinformation spread via manipulated videos. VADER matches and coarsely aligns partial video fragments to candidate videos using a robust visual descriptor and scalable search over adaptively chunked video content. A transformer-based alignment module then refines the temporal localization of the query fragment within the matched video. A space-time comparator module identifies regions of manipulation between aligned content, invariant to residual temporal misalignment and to artifacts arising from non-editorial changes to the content. Robustly matching video to a trusted source enables conclusions to be drawn on video provenance, enabling informed trust decisions about the content encountered. Code and data are available at https://github.com/AlexBlck/vader
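The abstract describes a three-stage pipeline: retrieval of a candidate video for a query fragment, temporal alignment of the fragment within that video, and comparison of the aligned content to localize manipulation. The following is a minimal sketch of that control flow, assuming videos are represented as lists of per-frame feature vectors; the descriptor (mean pooling), the sliding-window alignment, and the similarity threshold are illustrative placeholders, not the paper's actual descriptor, transformer alignment module, or space-time comparator.

```python
import math

def cosine(a, b):
    # cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def chunk_descriptor(frames):
    # mean-pool per-frame features into one chunk-level descriptor
    # (placeholder for the paper's robust visual descriptor)
    n = len(frames)
    return [sum(f[i] for f in frames) / n for i in range(len(frames[0]))]

def retrieve(query_frames, database):
    # Stage 1: match the query fragment to the most similar candidate video
    q = chunk_descriptor(query_frames)
    return max(database, key=lambda vid: cosine(q, chunk_descriptor(database[vid])))

def align(query_frames, candidate_frames):
    # Stage 2: temporal localization by sliding-window similarity
    # (a stand-in for the paper's transformer-based alignment module)
    best_t, best_s = 0, -math.inf
    for t in range(len(candidate_frames) - len(query_frames) + 1):
        s = sum(cosine(q, c) for q, c in
                zip(query_frames, candidate_frames[t:t + len(query_frames)]))
        if s > best_s:
            best_t, best_s = t, s
    return best_t

def compare(query_frames, aligned_frames, thresh=0.9):
    # Stage 3: flag frames whose content diverges beyond a threshold
    # (a crude proxy for the paper's space-time comparator)
    return [i for i, (q, c) in enumerate(zip(query_frames, aligned_frames))
            if cosine(q, c) < thresh]
```

In this toy setup, a manipulated query fragment would first be matched to its source video, localized to an offset, and then `compare` would flag the divergent frames; the actual method additionally localizes manipulated spatial regions within each frame.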

Cite

Text

Black et al. "VADER: Video Alignment Differencing and Retrieval." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.02043

Markdown

[Black et al. "VADER: Video Alignment Differencing and Retrieval." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/black2023iccv-vader/) doi:10.1109/ICCV51070.2023.02043

BibTeX

@inproceedings{black2023iccv-vader,
  title     = {{VADER: Video Alignment Differencing and Retrieval}},
  author    = {Black, Alexander and Jenni, Simon and Bui, Tu and Tanjim, Md. Mehrab and Petrangeli, Stefano and Sinha, Ritwik and Swaminathan, Viswanathan and Collomosse, John},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {22357--22367},
  doi       = {10.1109/ICCV51070.2023.02043},
  url       = {https://mlanthology.org/iccv/2023/black2023iccv-vader/}
}