∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions
Abstract
Synthesizing high-resolution images from intricate, domain-specific information remains a significant challenge in generative modeling, particularly for applications in large-image domains such as digital histopathology and remote sensing. Existing methods face critical limitations: conditional diffusion models in pixel or latent space cannot exceed the resolution on which they were trained without losing fidelity, and computational demands increase significantly for larger image sizes. Patch-based methods offer computational efficiency but fail to capture long-range spatial relationships due to their overreliance on local information. In this paper, we introduce a novel conditional diffusion model in infinite dimensions, ∞-Brush, for controllable large image synthesis. We propose a cross-attention neural operator to enable conditioning in function space. Our model overcomes the constraints of traditional finite-dimensional diffusion models and patch-based methods, offering scalability and superior capability in preserving global image structures while maintaining fine details. To the best of our knowledge, ∞-Brush is the first conditional diffusion model in function space that can controllably synthesize images at arbitrary resolutions of up to 4096 × 4096 pixels. The code is available at https://github.com/cvlab-stonybrook/infinity-brush.
Cite
Text
Le et al. "∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73411-3_22
Markdown
[Le et al. "∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/le2024eccv-brush/) doi:10.1007/978-3-031-73411-3_22
BibTeX
@inproceedings{le2024eccv-brush,
title = {{∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions}},
author = {Le, Minh-Quan and Graikos, Alexandros and Yellapragada, Srikar and Gupta, Rajarsi and Saltz, Joel and Samaras, Dimitris},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-73411-3_22},
url = {https://mlanthology.org/eccv/2024/le2024eccv-brush/}
}