LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer

Abstract

Universal image restoration (UIR) aims to recover images degraded by unknown mixtures of degradations while preserving semantics, conditions under which discriminative restorers and UNet-based diffusion priors often oversmooth, hallucinate, or drift. We present LucidFlux, a caption-free UIR framework that adapts a large diffusion transformer (Flux.1) to restoration with minimal parameter overhead. LucidFlux introduces a lightweight dual-branch conditioner that injects signals from the degraded input and from a lightly restored proxy, which respectively anchor geometry and suppress artifacts. A timestep- and layer-adaptive modulation schedule routes these cues across the backbone’s hierarchy, yielding coarse-to-fine, context-aware updates that protect global structure while recovering texture. To avoid the latency and instability of text prompts or VLM captions, we enforce caption-free semantic alignment via SigLIP features extracted from the proxy. A scalable curation pipeline further filters large-scale data for structure-rich supervision. On both synthetic and in-the-wild benchmarks, LucidFlux consistently surpasses strong open-source and commercial baselines on seven metrics, with clear visual gains in realism, detail, and artifact suppression. Ablations confirm that, for large DiTs, when, where, and what to condition (rather than scaling parameters or relying on text prompts) is the key lever for robust, prompt-free restoration.

Cite

Text

Fei et al. "LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer." International Conference on Learning Representations, 2026.

Markdown

[Fei et al. "LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/fei2026iclr-lucidflux/)

BibTeX

@inproceedings{fei2026iclr-lucidflux,
  title     = {{LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer}},
  author    = {Fei, Song and Ye, Tian and Wang, Lujia and Zhu, Lei},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/fei2026iclr-lucidflux/}
}