Deconstructing Guidance: A Semantic Hierarchy for Precise Diffusion Model Editing

Abstract

Text-guided image editing requires more than prompt following—it demands a principled understanding of what to modify versus what to preserve. We investigate the internal guidance mechanism of diffusion models and reveal that the guidance signal follows a structured semantic hierarchy. We formalize this insight as the Semantic Scale Hypothesis: the magnitude of the guidance difference vector ($\Delta\boldsymbol{\epsilon}$) directly encodes the semantic scale of edits. Crucially, this phenomenon is theoretically grounded in Tweedie’s formula, which links score prediction to the variance of the underlying data distribution. Low-variance regions, such as objects, yield large-magnitude differences corresponding to structural edits, whereas high-variance regions, such as backgrounds, yield small-magnitude differences corresponding to stylistic adjustments. Building on this principle, we introduce Prism-Edit, a training-free, plug-and-play module that decomposes the guidance signal into semantic layers, enabling selective and interpretable control. Extensive experiments—spanning direct visualization of the semantic hierarchy, generalization across foundation models, and integration with state-of-the-art editors—demonstrate that Prism-Edit achieves precise, robust, and controllable editing. Our findings establish semantic scale as a foundational axis for understanding and advancing diffusion-based image editing.
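To make the decomposition idea concrete, here is a minimal NumPy sketch. It is not the authors' Prism-Edit implementation. It assumes that $\Delta\boldsymbol{\epsilon}$ is the classifier-free guidance difference between the conditional and unconditional noise predictions, and that "semantic layers" can be approximated by binning per-pixel magnitudes of $\Delta\boldsymbol{\epsilon}$ into quantile bands. The function names `split_guidance_by_magnitude` and `recombine`, the quantile binning, and the per-layer weights are all hypothetical choices made for illustration.

```python
import numpy as np


def split_guidance_by_magnitude(eps_cond, eps_uncond, num_layers=3):
    """Split the guidance difference into magnitude bands (hypothetical helper).

    eps_cond, eps_uncond: noise predictions of shape (C, H, W).
    Returns the full difference and a list of `num_layers` masked copies,
    ordered from smallest-magnitude band to largest-magnitude band.
    """
    # Assumed definition: conditional minus unconditional prediction.
    delta = eps_cond - eps_uncond
    # Per-pixel magnitude, taken over the channel axis -> shape (H, W).
    magnitude = np.linalg.norm(delta, axis=0)
    # Quantile edges so that each band holds roughly the same number of pixels.
    edges = np.quantile(magnitude, np.linspace(0.0, 1.0, num_layers + 1))
    # Interior edges map every pixel to a band index in [0, num_layers - 1].
    band = np.searchsorted(edges[1:-1], magnitude, side="right")
    # Bands are disjoint and cover every pixel, so the layers sum to delta.
    layers = [delta * (band == k)[None, :, :] for k in range(num_layers)]
    return delta, layers


def recombine(eps_uncond, layers, weights):
    """Rebuild a guided prediction with one guidance weight per band."""
    guided = eps_uncond.copy()
    for layer, weight in zip(layers, weights):
        guided = guided + weight * layer
    return guided


# Usage with random stand-ins for real model outputs.
rng = np.random.default_rng(0)
eps_c = rng.normal(size=(4, 8, 8))
eps_u = rng.normal(size=(4, 8, 8))
delta, layers = split_guidance_by_magnitude(eps_c, eps_u, num_layers=3)
# Emphasize the largest-magnitude band and damp the smallest one.
guided = recombine(eps_u, layers, weights=[1.0, 4.0, 7.5])
```

With equal weights for every band, `recombine` reduces to ordinary classifier-free guidance, `eps_uncond + w * delta`. Unequal weights are one simple way to act on large-magnitude and small-magnitude components separately, which is the kind of selective control the abstract describes.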

Cite

Text

Jeong et al. "Deconstructing Guidance: A Semantic Hierarchy for Precise Diffusion Model Editing." International Conference on Learning Representations, 2026.

Markdown

[Jeong et al. "Deconstructing Guidance: A Semantic Hierarchy for Precise Diffusion Model Editing." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/jeong2026iclr-deconstructing/)

BibTeX

@inproceedings{jeong2026iclr-deconstructing,
  title     = {{Deconstructing Guidance: A Semantic Hierarchy for Precise Diffusion Model Editing}},
  author    = {Jeong, Wootaek and Sohn, Junghyo and Yoon, Jee Seok and Suk, Heung-Il},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/jeong2026iclr-deconstructing/}
}