MCM: Multi-Condition Motion Synthesis Framework

Abstract

Generative Image Tampering (GIT), owing to its high diversity and realism, poses a significant challenge to traditional image tampering localization techniques. This paper therefore introduces DcDsDiff, a framework built on denoising diffusion probabilistic models that comprises a Dual-View Conditional Network (DVCN) and a Dual-Stream Denoising Network (DSDN). DVCN provides clues about tampered regions: it extracts tampering features in the high-frequency view and integrates them with spatial-domain features via attention mechanisms. DSDN jointly generates the mask image and the detail image, and its iterative denoising enhances the model's generalization to novel tampering forms. A multi-stream interaction mechanism lets the two generative tasks reinforce each other, yielding localization results that are both detailed and complete. Experiments show that DcDsDiff outperforms mainstream methods in localization accuracy, generalization, extensibility, and robustness. Code page: https://github.com/QixianHao/DcDsDiff-and-GIT10K.
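The abstract's multi-stream interaction idea, where a mask stream and a detail stream exchange information at each denoising step, can be illustrated with a toy sketch. This is not the authors' implementation: the shrinkage "denoisers", the mixing weight `lam`, and all function names are hypothetical stand-ins for the learned networks, chosen only to show the interaction pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_stream_step(mask_x, detail_x, alpha=0.9, lam=0.1):
    """One toy denoising step: each stream denoises its own signal,
    then mixes in the other stream's estimate (the multi-stream
    interaction). The shrinkage operators stand in for learned
    denoising networks; they are NOT the paper's actual models."""
    mask_pred = alpha * mask_x        # stand-in for the mask denoiser
    detail_pred = alpha * detail_x    # stand-in for the detail denoiser
    # Interaction: each stream is nudged toward the other's estimate,
    # so the two generative tasks can inform each other.
    new_mask = (1 - lam) * mask_pred + lam * detail_pred
    new_detail = (1 - lam) * detail_pred + lam * mask_pred
    return new_mask, new_detail

def denoise(steps=50, size=16):
    """Iteratively refine both streams from pure noise."""
    mask_x = rng.standard_normal((size, size))
    detail_x = rng.standard_normal((size, size))
    for _ in range(steps):
        mask_x, detail_x = cross_stream_step(mask_x, detail_x)
    return mask_x, detail_x

mask, detail = denoise()
```

In the actual framework, the conditioning features from DVCN would additionally guide each step; this sketch only demonstrates how iterative denoising with cross-stream mixing couples the two outputs.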

Cite

Text

Ling et al. "MCM: Multi-Condition Motion Synthesis Framework." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/120

Markdown

[Ling et al. "MCM: Multi-Condition Motion Synthesis Framework." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/ling2024ijcai-mcm/) doi:10.24963/ijcai.2024/120

BibTeX

@inproceedings{ling2024ijcai-mcm,
  title     = {{MCM: Multi-Condition Motion Synthesis Framework}},
  author    = {Ling, Zeyu and Han, Bo and Wong, Yongkang and Lin, Han and Kankanhalli, Mohan S. and Geng, Weidong},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {1083--1091},
  doi       = {10.24963/ijcai.2024/120},
  url       = {https://mlanthology.org/ijcai/2024/ling2024ijcai-mcm/}
}