Edicho: Consistent Image Editing in the Wild

Abstract

Consistent editing across in-the-wild images is a well-attested need, yet it remains a technical challenge due to unmanageable factors such as object poses, lighting conditions, and photography environments. Edicho steps in with a training-free solution based on diffusion models, featuring a fundamental design principle of using explicit image correspondence to direct editing. Specifically, the key components include an attention manipulation module and a carefully refined classifier-free guidance (CFG) denoising strategy, both of which take the pre-estimated correspondence into account. Such an inference-time algorithm enjoys a plug-and-play nature and is compatible with most diffusion-based editing methods, such as ControlNet and BrushNet. Extensive results demonstrate the efficacy of Edicho in consistent cross-image editing under diverse settings.
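To make the core idea concrete, below is a minimal PyTorch sketch of correspondence-guided attention in the spirit the abstract describes: a precomputed dense correspondence is used to gather the source image's attention keys and values into alignment with the target image's tokens, so both images are edited coherently. All tensor shapes, the function name, and the correspondence encoding are illustrative assumptions, not the authors' exact implementation.

import torch


def correspondence_guided_attention(
    q_tgt: torch.Tensor,  # target queries, shape (B, L, C), L = H * W tokens
    k_src: torch.Tensor,  # source keys,    shape (B, L, C)
    v_src: torch.Tensor,  # source values,  shape (B, L, C)
    corr: torch.Tensor,   # assumed encoding: (B, L) long tensor mapping each
                          # target token to the index of its matching source token
) -> torch.Tensor:
    """Attend from target queries to source K/V warped via correspondence."""
    b, l, c = q_tgt.shape
    idx = corr[..., None].expand(-1, -1, c)           # (B, L, C) gather indices
    k_warp = torch.gather(k_src, dim=1, index=idx)    # source keys aligned to target
    v_warp = torch.gather(v_src, dim=1, index=idx)    # source values aligned to target

    # Standard scaled dot-product attention over the warped source features;
    # the refined CFG strategy in the paper would further fuse the conditional
    # and unconditional branches using the same correspondence (not shown here).
    attn = torch.softmax(q_tgt @ k_warp.transpose(-1, -2) / c**0.5, dim=-1)
    return attn @ v_warp                              # (B, L, C)

Because the manipulation touches only the attention inputs at inference time, a sketch like this slots into an existing diffusion editing pipeline (e.g., ControlNet or BrushNet) without retraining, which is the plug-and-play property the abstract highlights.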

Cite

Text

Bai et al. "Edicho: Consistent Image Editing in the Wild." International Conference on Computer Vision, 2025.

Markdown

[Bai et al. "Edicho: Consistent Image Editing in the Wild." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/bai2025iccv-edicho/)

BibTeX

@inproceedings{bai2025iccv-edicho,
  title     = {{Edicho: Consistent Image Editing in the Wild}},
  author    = {Bai, Qingyan and Ouyang, Hao and Xu, Yinghao and Wang, Qiuyu and Yang, Ceyuan and Cheng, Ka Leong and Shen, Yujun and Chen, Qifeng},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {15277--15287},
  url       = {https://mlanthology.org/iccv/2025/bai2025iccv-edicho/}
}