VITON: An Image-Based Virtual Try-on Network

Abstract

We present an image-based VIrtual Try-On Network (VITON) without using 3D information in any form, which seamlessly transfers a desired clothing item onto the corresponding region of a person using a coarse-to-fine strategy. Conditioned upon a new clothing-agnostic yet descriptive person representation, our framework first generates a coarse synthesized image with the target clothing item overlaid on that same person in the same pose. We further enhance the initial blurry clothing area with a refinement network. The network is trained to learn how much detail to utilize from the target clothing item, and where to apply it on the person, in order to synthesize a photo-realistic image in which the target item deforms naturally with clear visual patterns. Experiments on our newly collected Zalando dataset demonstrate its promise in the image-based virtual try-on task over state-of-the-art generative models.
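The abstract does not say how the refinement step decides "how much detail to utilize" and "where to apply" it. One plausible reading is a per-pixel blend between the coarse result and the target clothing, weighted by a learned mask. The sketch below illustrates only that blending arithmetic, on tiny grayscale images stored as nested lists. The function name `compose` and the mask name `alpha` are hypothetical and do not come from the abstract. In a real system the mask would be predicted by a network, not written by hand.

```python
def compose(coarse, clothing, alpha):
    """Blend two same-sized grayscale images (nested lists of floats in [0, 1]).

    alpha[y][x] == 1 takes the pixel from `clothing`;
    alpha[y][x] == 0 keeps the pixel from `coarse`.
    All three names are illustrative assumptions, not the paper's API.
    """
    if not (len(coarse) == len(clothing) == len(alpha)):
        raise ValueError("inputs must have the same height")
    out = []
    for row_c, row_t, row_a in zip(coarse, clothing, alpha):
        if not (len(row_c) == len(row_t) == len(row_a)):
            raise ValueError("inputs must have the same width")
        # Per-pixel convex combination of clothing detail and coarse output.
        out.append([a * t + (1.0 - a) * c
                    for c, t, a in zip(row_c, row_t, row_a)])
    return out


# Hand-written 2x2 example: the mask uses clothing detail fully at the
# top-left pixel, half at the bottom-left pixel, and not at all on the right.
coarse = [[0.2, 0.2], [0.2, 0.2]]
clothing = [[0.8, 0.8], [0.8, 0.8]]
alpha = [[1.0, 0.0], [0.5, 0.0]]
result = compose(coarse, clothing, alpha)
```

Where the mask is 0 the coarse image passes through unchanged, so a mask confined to the clothing region would leave the face, hair, and background of the coarse result untouched.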

Cite

Text

Han et al. "VITON: An Image-Based Virtual Try-on Network." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00787

Markdown

[Han et al. "VITON: An Image-Based Virtual Try-on Network." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/han2018cvpr-viton/) doi:10.1109/CVPR.2018.00787

BibTeX

@inproceedings{han2018cvpr-viton,
  title     = {{VITON: An Image-Based Virtual Try-on Network}},
  author    = {Han, Xintong and Wu, Zuxuan and Wu, Zhe and Yu, Ruichi and Davis, Larry S.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00787},
  url       = {https://mlanthology.org/cvpr/2018/han2018cvpr-viton/}
}