Collecting the Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures

Abstract

Human pose transfer synthesizes new view(s) of a person for a given pose. Recent work achieves this via self-reconstruction, which disentangles a person's pose and texture information by breaking down the person into several parts, then recombines them to reconstruct the person. However, this part-level disentanglement preserves some pose information that can create unwanted artifacts. In this paper, we propose Pose Transfer by Permuting Textures, a self-driven human pose transfer approach that disentangles pose from texture at the patch level. Specifically, we remove pose from an input image by permuting image patches so only texture information remains. Then we reconstruct the input image by sampling from the permuted textures to achieve patch-level disentanglement. To reduce the noise and recover clothing shape information from the permuted patches, we employ encoders with multiple kernel sizes in a triple branch network. Extensive experiments on DeepFashion and Market-1501 show that our model improves the quality of generated images in terms of FID, LPIPS and SSIM over other self-driven methods, and even outperforms some fully-supervised methods. A user study also shows that among self-driven approaches, images generated by our method are preferred in 68% of cases over prior work. Code is available at https://github.com/NannanLi999/pt_square.
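
The patch permutation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a uniform random shuffle of non-overlapping square patches, and the function name `permute_patches` and its `patch_size` and `seed` parameters are hypothetical choices made for illustration only.

```python
import numpy as np


def permute_patches(image, patch_size, seed=None):
    """Shuffle non-overlapping square patches of an H x W x C image.

    Hypothetical helper: a uniform random shuffle is an assumption,
    not necessarily the permutation scheme used in the paper.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size

    # Split the image into a (rows * cols, patch_size, patch_size, C) stack.
    patches = (
        image.reshape(rows, patch_size, cols, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(rows * cols, patch_size, patch_size, c)
    )

    # Reorder the patches; spatial layout (pose) is lost, texture is kept.
    rng = np.random.default_rng(seed)
    shuffled = patches[rng.permutation(rows * cols)]

    # Reassemble the shuffled patches into an image of the original size.
    return (
        shuffled.reshape(rows, cols, patch_size, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(h, w, c)
    )
```

Calling `permute_patches(img, 16, seed=0)` on a 256 x 256 x 3 array would return an array of the same shape whose 16 x 16 patches are the original ones in a different order. The shuffled result keeps every pixel of the input, so texture statistics survive while the global arrangement does not.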

Cite

Text

Li et al. "Collecting the Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00655

Markdown

[Li et al. "Collecting the Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/li2023iccv-collecting/) doi:10.1109/ICCV51070.2023.00655

BibTeX

@inproceedings{li2023iccv-collecting,
  title     = {{Collecting the Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures}},
  author    = {Li, Nannan and Shih, Kevin J. and Plummer, Bryan A.},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {7126--7137},
  doi       = {10.1109/ICCV51070.2023.00655},
  url       = {https://mlanthology.org/iccv/2023/li2023iccv-collecting/}
}