FashionTailor: Controllable Clothing Editing for Human Images with Appearance Preserving

Abstract

Garment structure is a crucial medium for expressing a designer's creative vision and showcasing the distinctive character of clothing items. Effective editing of garment structure in fashion images allows a design to be previewed in advance, accelerating garment customization to meet individual requirements. Although large-scale diffusion models have demonstrated impressive image generation and editing capabilities, no efforts have been made to exploit their potential for part-level image editing. Unlike previous research, we define a clothing structure editing (CSE) task that aims to accurately edit the local structure of human-centered clothing images through simple instruction-based prompts while keeping the clothing appearance consistent. Specifically, this paper develops a new controllable triple-flow framework for structure editing named FashionTailor. An additional network, ClothingNet, is proposed to extract clothing details and address the rigid constraints imposed by the original garment structure. We then propose a semantic-refined module that extracts a semantic understanding of the source image and adaptively focuses on the part to be edited. We also design a cross-blend attention mechanism that integrates fine-grained clothing features to guarantee precise alignment between appearance features and target structure features. In addition, we collate a garment structure dataset called StructureFashion, containing over six million pairs, in which each clothing item is represented by multiple photos with diverse structural characteristics. Finally, our method supports editing the structure of multiple parts of a garment simultaneously. Extensive experiments validate the effectiveness of our method for part-level editing of human images on the StructureFashion dataset and in real-world scenarios.

Cite

Text

Hou et al. "FashionTailor: Controllable Clothing Editing for Human Images with Appearance Preserving." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I4.32364

Markdown

[Hou et al. "FashionTailor: Controllable Clothing Editing for Human Images with Appearance Preserving." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/hou2025aaai-fashiontailor/) doi:10.1609/AAAI.V39I4.32364

BibTeX

@inproceedings{hou2025aaai-fashiontailor,
  title     = {{FashionTailor: Controllable Clothing Editing for Human Images with Appearance Preserving}},
  author    = {Hou, Jie and Ma, Jianghong and Mu, Xiangyu and Zhang, Haijun and Zhang, Zhao},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {3509--3517},
  doi       = {10.1609/AAAI.V39I4.32364},
  url       = {https://mlanthology.org/aaai/2025/hou2025aaai-fashiontailor/}
}