Preserve Anything: Controllable Image Synthesis with Object Preservation

Abstract

We introduce Preserve Anything, a novel method for controlled image synthesis that addresses key limitations in object preservation and semantic consistency in text-to-image (T2I) generation. Existing approaches often fail to (i) preserve multiple objects with fidelity, (ii) maintain semantic alignment with prompts, or (iii) provide explicit control over scene composition. To overcome these challenges, the proposed method employs an N-channel ControlNet that integrates (i) object preservation with size and placement agnosticism, color and detail retention, and artifact elimination, (ii) high-resolution, semantically consistent backgrounds with accurate shadows, lighting, and prompt adherence, and (iii) explicit user control over background layouts and lighting conditions. Key components of our framework include object preservation and background guidance modules, enforcing lighting consistency, and a high-frequency overlay module to retain fine details while mitigating unwanted artifacts. We introduce a benchmark dataset consisting of 240K natural images filtered for aesthetic quality and 18K 3D-rendered synthetic images with metadata such as lighting, camera angles, and object relationships. This dataset addresses the deficiencies of existing benchmarks and allows a complete evaluation. Empirical results demonstrate that our method achieves state-of-the-art performance, significantly improving feature-space fidelity (FID 15.26) and semantic alignment (CLIP-S 32.85) while maintaining competitive aesthetic quality. We also conducted a user study to demonstrate the efficacy of the proposed work on an unseen benchmark and observed a remarkable improvement of 25%, 19%, 13%, and 14% in terms of prompt alignment, photorealism, the presence of AI artifacts, and natural aesthetics, respectively, over existing works.
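
The abstract names a high-frequency overlay module but does not say how it is computed. As a rough illustration of the general idea only, the sketch below uses plain frequency separation: it takes the fine detail of the source object (source minus its blurred version) and re-applies it on top of the low-frequency content of the generated image inside the object mask. This is not the authors' implementation; the function names, the box-blur filter, and the `radius` parameter are all assumptions made for this example.

```python
import numpy as np


def box_blur(img, radius=2):
    """Box blur with edge padding. img is an HxWxC float array."""
    k = 2 * radius + 1
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    out = np.zeros_like(img, dtype=np.float64)
    # Sum the k*k shifted views of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def high_frequency_overlay(source, generated, mask, radius=2):
    """Hypothetical overlay: restore source detail inside the object mask.

    source, generated: HxWxC float arrays with values in [0, 1].
    mask: HxW array, 1 where the preserved object is, 0 elsewhere.
    """
    source = source.astype(np.float64)
    generated = generated.astype(np.float64)
    m = mask.astype(np.float64)[..., None]

    # High-frequency residual of the source (fine texture and edges).
    detail = source - box_blur(source, radius)
    # Low-frequency content of the generated image (color, lighting).
    low = box_blur(generated, radius)

    restored = np.clip(low + detail, 0.0, 1.0)
    # Outside the mask the generated image is left untouched.
    return m * restored + (1.0 - m) * generated
```

With this formulation, pixels outside the mask are unchanged, and when the generated image already equals the source the overlay returns the source. A real system would likely use a smoother filter and a feathered mask to avoid visible seams at the object boundary.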

Cite

Text

Sharma et al. "Preserve Anything: Controllable Image Synthesis with Object Preservation." International Conference on Computer Vision, 2025.

Markdown

[Sharma et al. "Preserve Anything: Controllable Image Synthesis with Object Preservation." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/sharma2025iccv-preserve/)

BibTeX

@inproceedings{sharma2025iccv-preserve,
  title     = {{Preserve Anything: Controllable Image Synthesis with Object Preservation}},
  author    = {Sharma, Prasen Kumar and Matiyali, Neeraj and Srivastava, Siddharth and Sharma, Gaurav},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {18058-18067},
  url       = {https://mlanthology.org/iccv/2025/sharma2025iccv-preserve/}
}