Mask Does Not Matter: A Unified Latent Diffusion-Enhanced Framework for Mask-Free Virtual Try-on

Abstract

A good virtual try-on model should introduce minimal redundant conditional information, both to avoid instability and to improve inference efficiency. Existing methods rely on inpainting masks to guide generation, but these masks come from unstable human parsers, and segmentation errors often leave fabric residues in the results. Moreover, large mask regions can discard spatial structure and identity information, so extra conditional inputs are needed to compensate, which increases model instability and reduces efficiency. To tackle these problems, we present a novel Mask-Free virtual Try-ON (MFTON) framework. Specifically, we propose a mask-free strategy that eliminates all denoising conditions except the clothing and person images, extracting spatial structure and identity information directly from the person image to improve efficiency and reduce instability. Additionally, to optimize the generated clothing regions, we propose a clothing texture-aware attention mechanism that lets the model focus on generating textures with significant visual differences. We then introduce a geometric detail capture loss that further helps the model capture high-frequency information. Finally, we propose an appearance consistency inference method that significantly reduces the initial randomness of the sampling process. Extensive experiments on popular datasets demonstrate that our method outperforms state-of-the-art virtual try-on methods.

Cite

Text

Du et al. "Mask Does Not Matter: A Unified Latent Diffusion-Enhanced Framework for Mask-Free Virtual Try-on." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/105

Markdown

[Du et al. "Mask Does Not Matter: A Unified Latent Diffusion-Enhanced Framework for Mask-Free Virtual Try-on." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/du2025ijcai-mask/) doi:10.24963/IJCAI.2025/105

BibTeX

@inproceedings{du2025ijcai-mask,
  title     = {{Mask Does Not Matter: A Unified Latent Diffusion-Enhanced Framework for Mask-Free Virtual Try-on}},
  author    = {Du, Chenghu and Wang, Junyin and Liu, Kai and Xiong, Shengwu and Rong, Yi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {936--944},
  doi       = {10.24963/IJCAI.2025/105},
  url       = {https://mlanthology.org/ijcai/2025/du2025ijcai-mask/}
}