DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

Abstract

Current perceptive models heavily depend on resource-intensive datasets, prompting the need for innovative solutions. Leveraging recent advances in diffusion models, synthetic data, by constructing image inputs from various annotations, proves beneficial for downstream tasks. While prior methods have separately addressed generative and perceptive models, DetDiffusion, for the first time, harmonizes both, tackling the challenges in generating effective data for perceptive models. To enhance image generation with perceptive models, we introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability. To boost the performance of specific perceptive models, our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation. Experimental results from the object detection task highlight DetDiffusion's superior performance, establishing a new state-of-the-art in layout-guided generation. Furthermore, image syntheses from DetDiffusion can effectively augment training data, significantly enhancing downstream detection performance.

Cite

Text

Wang et al. "DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00692

Markdown

[Wang et al. "DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/wang2024cvpr-detdiffusion/) doi:10.1109/CVPR52733.2024.00692

BibTeX

@inproceedings{wang2024cvpr-detdiffusion,
  title     = {{DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception}},
  author    = {Wang, Yibo and Gao, Ruiyuan and Chen, Kai and Zhou, Kaiqiang and Cai, Yingjie and Hong, Lanqing and Li, Zhenguo and Jiang, Lihui and Yeung, Dit-Yan and Xu, Qiang and Zhang, Kai},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {7246--7255},
  doi       = {10.1109/CVPR52733.2024.00692},
  url       = {https://mlanthology.org/cvpr/2024/wang2024cvpr-detdiffusion/}
}