Multi-Modal Aerial View Image Challenge: Sensor Domain Translation

Abstract

This paper describes the design, outcomes, and top methods of the 2nd annual Multi-modal Aerial View Image Challenge (MAVIC), aimed at cross-modality aerial image translation. The primary objective of this competition is to stimulate research efforts towards the development of models capable of translating co-aligned images between multiple modalities. Specifically, the challenge centers on translation between synthetic aperture radar (SAR), electro-optical (EO), camera (RGB), and infrared (IR) sensor modalities, a budding area of research that has begun to garner attention. While last year’s inaugural challenge demonstrated the feasibility of SAR→EO translation, this year’s challenge made significant improvements in dataset coverage, sensor variation, experimental design, and methods, covering the tasks of SAR→EO, SAR→RGB, SAR→IR, and RGB→IR translation. By introducing a new dataset called Multi-modal Aerial Gathered Image Composites (MAGIC), multimodal image translation is available for different comparisons. With a more rigorous set of translation performance metrics, winners were determined from an aggregation of L1-norm, LPIPS (Learned Perceptual Image Patch Similarity), and FID (Fréchet Inception Distance) scores. The winning methods included pix2pixHD and LPIPS metrics as loss functions, with an aggregated score 5% better, separated by the SAR→EO and RGB→IR translation scores.

Cite

Text

Low et al. "Multi-Modal Aerial View Image Challenge: Sensor Domain Translation." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00315

Markdown

[Low et al. "Multi-Modal Aerial View Image Challenge: Sensor Domain Translation." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/low2024cvprw-multimodal/) doi:10.1109/CVPRW63382.2024.00315

BibTeX

@inproceedings{low2024cvprw-multimodal,
  title     = {{Multi-Modal Aerial View Image Challenge: Sensor Domain Translation}},
  author    = {Low, Spencer and Nina, Oliver and Bowald, Dylan and Sappa, Angel Domingo and Inkawhich, Nathan and Bruns, Peter},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {3096-3104},
  doi       = {10.1109/CVPRW63382.2024.00315},
  url       = {https://mlanthology.org/cvprw/2024/low2024cvprw-multimodal/}
}