MCM: Multi-Condition Motion Synthesis Framework
Abstract
Generative Image Tampering (GIT), owing to its high diversity and realism, poses a significant challenge to traditional image tampering localization techniques. This paper therefore introduces DcDsDiff, a method based on denoising diffusion probabilistic models that comprises a Dual-View Conditional Network (DVCN) and a Dual-Stream Denoising Network (DSDN). DVCN provides clues about the tampered regions: it extracts tampering features in the high-frequency view and fuses them with spatial-domain features through attention mechanisms. DSDN jointly generates a mask image and a detail image, and its iterative denoising enhances the model's generalization to new tampering forms. A multi-stream interaction mechanism lets the two generative tasks reinforce each other, prompting the model to produce localization results that are both detailed and complete. Experiments show that DcDsDiff outperforms mainstream methods in localization accuracy, generalization, extensibility, and robustness. Code page: https://github.com/QixianHao/DcDsDiff-and-GIT10K.
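The dual-view conditioning described above can be pictured as a two-step recipe: derive a high-frequency view of the image, then fuse high-frequency features with spatial-domain features via attention. The following is a minimal NumPy sketch of that idea only, not the authors' implementation; the box-blur high-pass residual and the single-head cross-attention with residual fusion are illustrative assumptions.

```python
import numpy as np

def high_frequency_view(img):
    # High-pass residual: image minus a 3x3 box blur.
    # (Illustrative stand-in for the paper's high-frequency extraction.)
    pad = np.pad(img, 1, mode="edge")
    blur = sum(pad[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0
    return img - blur

def attention_fuse(spatial_feats, freq_feats):
    # Scaled dot-product cross-attention: spatial features query the
    # high-frequency features. Both inputs have shape (n_tokens, dim).
    d = spatial_feats.shape[-1]
    scores = spatial_feats @ freq_feats.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return spatial_feats + weights @ freq_feats    # residual fusion

rng = np.random.default_rng(0)
img = rng.random((8, 8))
hf = high_frequency_view(img)              # high-frequency view, (8, 8)
fused = attention_fuse(rng.random((4, 16)),
                       rng.random((4, 16)))  # fused features, (4, 16)
```

In this sketch a flat image suppresses the high-frequency view to near zero, so only regions with fine texture or tampering artifacts contribute strong conditioning signals to the fusion step.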
Cite
Text
Ling et al. "MCM: Multi-Condition Motion Synthesis Framework." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/120

Markdown

[Ling et al. "MCM: Multi-Condition Motion Synthesis Framework." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/ling2024ijcai-mcm/) doi:10.24963/ijcai.2024/120

BibTeX
@inproceedings{ling2024ijcai-mcm,
title = {{MCM: Multi-Condition Motion Synthesis Framework}},
author = {Ling, Zeyu and Han, Bo and Wong, Yongkang and Lin, Han and Kankanhalli, Mohan S. and Geng, Weidong},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {1083-1091},
doi = {10.24963/ijcai.2024/120},
url = {https://mlanthology.org/ijcai/2024/ling2024ijcai-mcm/}
}