GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis
Abstract
We present GeoSynth, a model for synthesizing satellite images with global style and image-driven layout control. Global style is controlled via textual prompts or geographic location, which enable the specification of scene semantics or regional appearance, respectively, and can be used together. We train our model on a large dataset of satellite imagery paired with automatically generated captions and OpenStreetMap data. We evaluate various combinations of control inputs, including different types of layout controls. Results demonstrate that our model can generate diverse, high-quality images and exhibits excellent zero-shot generalization. The code and model checkpoints are available at https://github.com/mvrl/GeoSynth.
Cite
Text
Sastry et al. "GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00051
Markdown
[Sastry et al. "GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/sastry2024cvprw-geosynth/) doi:10.1109/CVPRW63382.2024.00051
BibTeX
@inproceedings{sastry2024cvprw-geosynth,
title = {{GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis}},
author = {Sastry, Srikumar and Khanal, Subash and Dhakal, Aayush and Jacobs, Nathan},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {460--470},
doi = {10.1109/CVPRW63382.2024.00051},
url = {https://mlanthology.org/cvprw/2024/sastry2024cvprw-geosynth/}
}