Pose Guided Human Image Synthesis by View Disentanglement and Enhanced Weighting Loss

Abstract

View synthesis aims at generating a novel, unseen view of an object. This is a challenging task in the presence of occlusions and asymmetries. In this paper, we present View-Disentangled Generator (VDG), a two-stage deep network for pose-guided human-image generation that performs coarse view prediction followed by a refinement stage. In the first stage, the network predicts the output from a target human pose, the source image and the corresponding human pose, which are processed separately in different branches. This enables the network to learn a representation disentangled across the source and target views. In the second stage, the coarse output of the first stage is refined by adversarial training. Specifically, we introduce a masked version of the structural similarity loss that helps the network focus on generating a higher-quality view. Experiments on Market-1501 and DeepFashion demonstrate the effectiveness of the proposed generator.
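The masked structural similarity loss mentioned in the abstract can be illustrated with a minimal sketch. The snippet below is not the paper's implementation: it computes a simplified, global-statistics SSIM restricted to masked pixels (the standard SSIM uses local Gaussian windows), and the function name, constants, and mask convention are illustrative assumptions.

```python
import numpy as np

def masked_ssim(x, y, mask, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified SSIM between images x and y, computed only over
    pixels where mask is nonzero (e.g. a foreground/pose mask).

    x, y : arrays with values in [0, 1]
    mask : binary array of the same shape, weighting each pixel

    Note: this global-statistics variant is an illustrative sketch,
    not the windowed SSIM or the paper's exact loss.
    """
    w = mask.astype(np.float64)
    n = w.sum()
    # Masked means, variances, and covariance.
    mx = (w * x).sum() / n
    my = (w * y).sum() / n
    vx = (w * (x - mx) ** 2).sum() / n
    vy = (w * (y - my) ** 2).sum() / n
    cov = (w * (x - mx) * (y - my)).sum() / n
    # Standard SSIM combination of luminance/contrast/structure terms.
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

In training, one would typically use `1 - masked_ssim(generated, target, mask)` as a loss term so that reconstruction quality inside the mask is rewarded while background pixels are ignored.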

Cite

Text

Lakhal et al. "Pose Guided Human Image Synthesis by View Disentanglement and Enhanced Weighting Loss." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11012-3_30

Markdown

[Lakhal et al. "Pose Guided Human Image Synthesis by View Disentanglement and Enhanced Weighting Loss." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/lakhal2018eccvw-pose/) doi:10.1007/978-3-030-11012-3_30

BibTeX

@inproceedings{lakhal2018eccvw-pose,
  title     = {{Pose Guided Human Image Synthesis by View Disentanglement and Enhanced Weighting Loss}},
  author    = {Lakhal, Mohamed Ilyes and Lanz, Oswald and Cavallaro, Andrea},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2018},
  pages     = {380--394},
  doi       = {10.1007/978-3-030-11012-3_30},
  url       = {https://mlanthology.org/eccvw/2018/lakhal2018eccvw-pose/}
}