DiffusionSat: A Generative Foundation Model for Satellite Imagery

Abstract

Diffusion models have achieved state-of-the-art results on many modalities including images, speech, and video. However, existing models are not tailored to support remote sensing data, which is widely used in important applications including environmental monitoring and crop-yield prediction. Satellite images differ significantly from natural images: they can be multi-spectral and irregularly sampled across time, and existing diffusion models trained on images from the Web do not support them. Furthermore, remote sensing data is inherently spatio-temporal, requiring conditional generation tasks not supported by traditional methods based on captions or images. In this paper, we present DiffusionSat, to date the largest generative foundation model trained on a collection of publicly available large, high-resolution remote sensing datasets. As text-based captions are sparsely available for satellite images, we incorporate the associated metadata such as geolocation as conditioning information. Our method produces realistic samples and can be used to solve multiple generative tasks including temporal generation, multi-spectral super-resolution, and in-painting. Our method outperforms previous state-of-the-art methods for satellite image generation and is the first large-scale _generative_ foundation model for satellite imagery. The project website can be found here: https://samar-khanna.github.io/DiffusionSat/

Cite

Text

Khanna et al. "DiffusionSat: A Generative Foundation Model for Satellite Imagery." International Conference on Learning Representations, 2024.

Markdown

[Khanna et al. "DiffusionSat: A Generative Foundation Model for Satellite Imagery." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/khanna2024iclr-diffusionsat/)

BibTeX

@inproceedings{khanna2024iclr-diffusionsat,
  title     = {{DiffusionSat: A Generative Foundation Model for Satellite Imagery}},
  author    = {Khanna, Samar and Liu, Patrick and Zhou, Linqi and Meng, Chenlin and Rombach, Robin and Burke, Marshall and Lobell, David B. and Ermon, Stefano},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/khanna2024iclr-diffusionsat/}
}