A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks

Abstract

Diffusion models have demonstrated powerful capability as a versatilist for dense vision tasks, yet their ability to generalize to unseen domains remains rarely explored. This paper presents HarDiff, an efficient frequency learning scheme that advances generalizable paradigms for diffusion-based dense prediction. It draws inspiration from a fine-grained analysis of the Discrete Hartley Transform, in which some low-frequency features activate the broader content of an image, while some high-frequency features preserve sufficient detail for dense pixels. Accordingly, HarDiff consists of two key components. The low-frequency training process extracts structural priors from the source domain to enhance understanding of task-related content. The high-frequency sampling process uses detail-oriented guidance from the unseen target domain to infer precise dense predictions with target-related details. Extensive empirical evidence shows that HarDiff can be easily plugged into various dense vision tasks, e.g., semantic segmentation, depth estimation and haze removal, yielding improvements over state-of-the-art methods on twelve public benchmarks.
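To make the low-/high-frequency distinction concrete, the following is a minimal NumPy sketch of a 2D Discrete Hartley Transform and a frequency split. It is an illustration only, not the HarDiff implementation: the function names (`dht2`, `idht2`, `split_frequencies`) and the radial cutoff `radius` are assumptions introduced here, and the abstract does not specify how HarDiff selects its frequency bands. The sketch relies on the standard identity that, for a real input, the DHT equals the real part of the DFT minus its imaginary part.

```python
import numpy as np


def dht2(x):
    """2D Discrete Hartley Transform of a real array, computed via the FFT.

    Uses the identity H = Re(F) - Im(F), where F is the 2D DFT of x.
    """
    F = np.fft.fft2(x)
    return F.real - F.imag


def idht2(H):
    """Inverse 2D DHT: the transform is its own inverse up to a 1/(M*N) factor."""
    return dht2(H) / H.size


def split_frequencies(image, radius):
    """Split a 2D image into low- and high-frequency parts.

    `radius` is a hypothetical cutoff (in frequency bins) measured from the
    zero-frequency coefficient; it is an assumption made for illustration.
    """
    rows, cols = image.shape
    # Move the zero-frequency coefficient to index (rows // 2, cols // 2).
    H = np.fft.fftshift(dht2(image))
    yy, xx = np.ogrid[:rows, :cols]
    dist = np.hypot(yy - rows // 2, xx - cols // 2)
    low_mask = dist <= radius
    # Keep coefficients inside the disc for "low", outside it for "high".
    low = idht2(np.fft.ifftshift(H * low_mask))
    high = idht2(np.fft.ifftshift(H * ~low_mask))
    return low, high


rng = np.random.default_rng(0)
image = rng.random((16, 16))
low, high = split_frequencies(image, radius=3)
```

Because the transform is linear, `low + high` reconstructs the input image. The low band carries the coarse layout, while the high band carries edges and fine texture, which matches the content-versus-detail roles described in the abstract.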

Cite

Text

Bi et al. "A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks." International Conference on Computer Vision, 2025.

Markdown

[Bi et al. "A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/bi2025iccv-simple/)

BibTeX

@inproceedings{bi2025iccv-simple,
  title     = {{A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks}},
  author    = {Bi, Qi and Yi, Jingjun and Huang, Huimin and Zheng, Hao and Zhan, Haolan and Ji, Wei and Huang, Yawen and Li, Yuexiang and Zheng, Yefeng},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {6748--6760},
  url       = {https://mlanthology.org/iccv/2025/bi2025iccv-simple/}
}