M2SFormer: Multi-Spectral and Multi-Scale Attention with Edge-Aware Difficulty Guidance for Image Forgery Localization

Ju-Hyeon Nam, Dong-Hyun Moon, Sang-Chul Lee

ICCV 2025 pp. 15927-15938

/iccv/2025/nam2025iccv-m2sformer/

Abstract

Image editing techniques have rapidly advanced, facilitating both innovative use cases and malicious manipulation of digital images. Deep learning-based methods have recently achieved high accuracy in pixel-level forgery localization, yet they frequently struggle with computational overhead and limited representation power, particularly for subtle or complex tampering. In this paper, we propose M2SFormer, a novel Transformer encoder-based framework designed to overcome these challenges. Unlike approaches that process spatial and frequency cues separately, M2SFormer unifies multi-frequency and multi-scale attentions in the skip connection, harnessing global context to better capture diverse forgery artifacts. Additionally, our framework addresses the loss of fine detail during upsampling by utilizing a global prior map--a curvature metric indicating the difficulty of forgery localization--which then guides a difficulty-guided attention module to preserve subtle manipulations more effectively. Extensive experiments on multiple benchmark datasets demonstrate that M2SFormer outperforms existing state-of-the-art models, offering superior generalization in detecting and localizing forgeries across unseen domains. Our M2SFormer code is available in Github Link.

PDF ICCV Semantic Scholar

Cite

Text

Nam et al. "M2SFormer: Multi-Spectral and Multi-Scale Attention with Edge-Aware Difficulty Guidance for Image Forgery Localization." International Conference on Computer Vision, 2025.

Markdown

[Nam et al. "M2SFormer: Multi-Spectral and Multi-Scale Attention with Edge-Aware Difficulty Guidance for Image Forgery Localization." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/nam2025iccv-m2sformer/)

BibTeX

@inproceedings{nam2025iccv-m2sformer,
  title     = {{M2SFormer: Multi-Spectral and Multi-Scale Attention with Edge-Aware Difficulty Guidance for Image Forgery Localization}},
  author    = {Nam, Ju-Hyeon and Moon, Dong-Hyun and Lee, Sang-Chul},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {15927-15938},
  url       = {https://mlanthology.org/iccv/2025/nam2025iccv-m2sformer/}
}