D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-on

Abstract

In this paper, we introduce D4-VTON, an innovative solution for image-based virtual try-on. We address challenges from previous studies, such as semantic inconsistencies before and after garment warping and reliance on static, annotation-driven clothing parsers. Additionally, we tackle the complexities that diffusion-based VTON models face when handling simultaneous tasks like inpainting and denoising. Our approach utilizes two key technologies. First, Dynamic Semantics Disentangling Modules (DSDMs) extract abstract semantic information from garments to create distinct local flows, improving the precision of garment warping in a self-discovered manner. Second, by integrating a Differential Information Tracking Path (DITP), we establish a novel diffusion-based VTON paradigm. This path captures differential information between incomplete try-on inputs and their complete versions, enabling the network to handle multiple degradations independently, thereby minimizing learning ambiguities and achieving realistic results with minimal overhead. Extensive experiments demonstrate that D4-VTON significantly outperforms existing methods in both quantitative metrics and qualitative evaluations, confirming its capability to generate realistic images and ensure semantic consistency. Code is available at https://github.com/Jerome-Young/D4-VTON.

Cite

Text

Yang et al. "D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-on." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72952-2_3

Markdown

[Yang et al. "D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-on." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/yang2024eccv-d4vton/) doi:10.1007/978-3-031-72952-2_3

BibTeX

@inproceedings{yang2024eccv-d4vton,
  title     = {{D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-on}},
  author    = {Yang, Zhaotong and Jiang, Zicheng and Li, Xinzhe and Zhou, Huiyu and Dong, Junyu and Zhang, Huaidong and Du, Yong},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72952-2_3},
  url       = {https://mlanthology.org/eccv/2024/yang2024eccv-d4vton/}
}