A Training-Free Synthetic Data Selection Method for Semantic Segmentation

Abstract

Training semantic segmenters with synthetic data has attracted great attention because such data is easy to obtain and available in large quantities. Most previous methods focus on producing large-scale synthetic image-annotation samples and then training the segmenter on all of them. However, a main challenge remains: poor-quality samples are unavoidable, and training on them degrades the model. In this paper, we propose a training-free Synthetic Data Selection (SDS) strategy that uses CLIP to select high-quality samples for building a reliable synthetic dataset. Specifically, given massive synthetic image-annotation pairs, we first design a Perturbation-based CLIP Similarity (PCS) to measure the reliability of each synthetic image, thus removing samples with low-quality images. We then propose a class-balanced Annotation Similarity Filter (ASF), which compares each synthetic annotation with the response of CLIP to remove samples with low-quality annotations. Experimental results show that our method reduces the data size by half while the trained segmenter achieves higher performance.
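The abstract does not spell out how the class-balanced filtering is computed, so the following is only a minimal sketch of one plausible reading: keep a fixed fraction of the best-scoring samples within each class rather than applying one global threshold. The function name `select_class_balanced`, the `keep_ratio` parameter, and the precomputed per-sample quality scores are all assumptions made for illustration; they are not the paper's PCS or ASF definitions.

```python
from collections import defaultdict


def select_class_balanced(samples, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of samples per class (hypothetical helper).

    `samples` is a list of (sample_id, class_name, score) tuples, where `score`
    is an assumed, precomputed quality score (higher is better).
    """
    by_class = defaultdict(list)
    for sample_id, class_name, score in samples:
        by_class[class_name].append((score, sample_id))

    kept = []
    for scored in by_class.values():
        # Rank within the class so rare or hard classes are not wiped out.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        n_keep = max(1, int(len(scored) * keep_ratio))  # always keep at least one
        kept.extend(sample_id for _, sample_id in scored[:n_keep])
    return sorted(kept)


# Toy scores, invented for illustration only.
samples = [
    ("img0", "cat", 0.91), ("img1", "cat", 0.40),
    ("img2", "cat", 0.75), ("img3", "cat", 0.10),
    ("img4", "bus", 0.30), ("img5", "bus", 0.20),
]
selected = select_class_balanced(samples, keep_ratio=0.5)
print(selected)
```

In this toy input, a single global threshold of 0.5 would discard every "bus" sample, whereas per-class ranking retains the best one, which is the motivation for balancing the selection by class.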

Cite

Text

Tang et al. "A Training-Free Synthetic Data Selection Method for Semantic Segmentation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I7.32777

Markdown

[Tang et al. "A Training-Free Synthetic Data Selection Method for Semantic Segmentation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/tang2025aaai-training/) doi:10.1609/AAAI.V39I7.32777

BibTeX

@inproceedings{tang2025aaai-training,
  title     = {{A Training-Free Synthetic Data Selection Method for Semantic Segmentation}},
  author    = {Tang, Hao and Yu, Siyue and Pang, Jian and Zhang, Bingfeng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {7229-7237},
  doi       = {10.1609/AAAI.V39I7.32777},
  url       = {https://mlanthology.org/aaai/2025/tang2025aaai-training/}
}