Sparse and Complete Latent Organization for Geospatial Semantic Segmentation

Abstract

Geospatial semantic segmentation of remote sensing images suffers from large intra-class variance in both foreground and background classes. First, foreground objects in remote sensing images are tiny and are represented by only a few pixels, which leads to large foreground intra-class variance and undermines the discrimination between foreground classes (an issue first considered in this work). Second, the background class contains complex context, and its large intra-class variance results in false alarms. To alleviate these two issues, we construct a sparse and complete latent structure via prototypes. In particular, to enhance the sparsity of the latent space, we design a prototypical contrastive learning scheme that clusters prototypes of the same category together and pushes prototypes of different categories far apart. We also strengthen the completeness of the latent space by modeling all foreground categories and the hardest (nearest) background objects. We further design a patch shuffle augmentation for remote sensing images with complicated contexts. Our augmentation encourages the semantic information of an object to be correlated only with the limited, category-specific context within the patch, which further reduces intra-class variance. We conduct extensive evaluations on a large-scale remote sensing dataset and show that our approach outperforms state-of-the-art methods by a large margin.
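
The abstract does not spell out how the patch shuffle augmentation works, so the following is only a minimal sketch of one plausible reading: the image is cut into non-overlapping square patches, the patches are randomly permuted, and the label mask is permuted identically so that pixels and labels stay aligned. The function name `patch_shuffle`, the `patch_size` parameter, the uniform random permutation, and the requirement that the image dimensions be divisible by the patch size are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def patch_shuffle(image, mask, patch_size, rng=None):
    """Permute non-overlapping square patches of an image and its mask identically.

    Hypothetical helper: an illustrative sketch, not the authors' implementation.
    image: array of shape (H, W, C); mask: array of shape (H, W).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = mask.shape
    if image.shape[:2] != (h, w):
        raise ValueError("image and mask must share spatial dimensions")
    if h % patch_size or w % patch_size:
        raise ValueError("H and W must be divisible by patch_size")

    gh, gw = h // patch_size, w // patch_size  # patch grid size
    perm = rng.permutation(gh * gw)            # one permutation for both arrays

    def shuffle(arr):
        extra = arr.shape[2:]  # trailing channel dims, empty for the mask
        # (H, W, ...) -> (gh, p, gw, p, ...) -> (gh * gw, p, p, ...)
        patches = arr.reshape(gh, patch_size, gw, patch_size, *extra)
        patches = patches.swapaxes(1, 2)
        patches = patches.reshape(gh * gw, patch_size, patch_size, *extra)
        patches = patches[perm]
        # Reassemble the permuted patches into the original layout.
        out = patches.reshape(gh, gw, patch_size, patch_size, *extra)
        return out.swapaxes(1, 2).reshape(h, w, *extra)

    return shuffle(image), shuffle(mask)


if __name__ == "__main__":
    demo_image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
    demo_mask = demo_image.sum(axis=-1)
    new_image, new_mask = patch_shuffle(demo_image, demo_mask, patch_size=2)
    print(new_image.shape, new_mask.shape)
```

Because each patch moves as a unit, an object keeps only the context inside its own patch while the wider scene layout is scrambled, which is the effect the abstract attributes to the augmentation. How the patch size is chosen, and whether the shuffle is applied to the input image or to intermediate features, is not stated in the abstract.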

Cite

Text

Yang and Ma. "Sparse and Complete Latent Organization for Geospatial Semantic Segmentation." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00185

Markdown

[Yang and Ma. "Sparse and Complete Latent Organization for Geospatial Semantic Segmentation." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/yang2022cvpr-sparse/) doi:10.1109/CVPR52688.2022.00185

BibTeX

@inproceedings{yang2022cvpr-sparse,
  title     = {{Sparse and Complete Latent Organization for Geospatial Semantic Segmentation}},
  author    = {Yang, Fengyu and Ma, Chenyang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {1809--1818},
  doi       = {10.1109/CVPR52688.2022.00185},
  url       = {https://mlanthology.org/cvpr/2022/yang2022cvpr-sparse/}
}