3D-Aware Image Synthesis via Learning Structural and Textural Representations

Abstract

Making generative models 3D-aware bridges the 2D image space and the 3D physical world, yet remains challenging. Recent attempts equip a Generative Adversarial Network (GAN) with a Neural Radiance Field (NeRF), which maps 3D coordinates to pixel values, as a 3D prior. However, the implicit function in NeRF has a very local receptive field, making it hard for the generator to become aware of the global structure. Meanwhile, NeRF is built on volume rendering, which can be too costly for producing high-resolution results and thus increases the optimization difficulty. To alleviate these two problems, we propose a novel framework, termed VolumeGAN, for high-fidelity 3D-aware image synthesis through explicitly learning a structural representation and a textural representation. We first learn a feature volume to represent the underlying structure, which is then converted to a feature field using a NeRF-like model. The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis. Such a design enables independent control of the shape and the appearance. Extensive experiments on a wide range of datasets confirm that our approach achieves substantially higher image quality and better 3D control than previous methods.
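
The accumulation step mentioned above (feature field to 2D feature map) can be illustrated with a minimal sketch. It assumes the standard NeRF-style volume rendering quadrature applied to feature vectors instead of colors; the abstract does not give the exact formulation, so this is not the authors' implementation. The function name `accumulate_along_ray`, its arguments, and the toy values are all hypothetical.

```python
import math


def accumulate_along_ray(densities, features, deltas):
    """Composite per-sample feature vectors along one ray into a single vector.

    densities: list of S non-negative floats (density at each sample)
    features:  list of S feature vectors, each a list of C floats
    deltas:    list of S positive floats (spacing between adjacent samples)
    Returns (accumulated_feature, weights).

    Assumption: standard NeRF quadrature, used here on features, not colors.
    """
    num_channels = len(features[0])
    accumulated = [0.0] * num_channels
    weights = []
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for sigma, feat, delta in zip(densities, features, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        weights.append(weight)
        for c in range(num_channels):
            accumulated[c] += weight * feat[c]
        transmittance *= 1.0 - alpha            # light surviving past this sample
    return accumulated, weights


# Toy ray: three samples, two feature channels; all values are made up.
feature, weights = accumulate_along_ray(
    densities=[0.0, 50.0, 1.0],
    features=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    deltas=[0.1, 0.1, 0.1],
)
```

In this toy ray the first sample is empty space and receives zero weight, while the dense second sample absorbs almost all of the ray, so the accumulated feature is close to that sample's feature vector. Repeating this for every pixel's ray would give the 2D feature map that a neural renderer then turns into an image.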

Cite

Text

Xu et al. "3D-Aware Image Synthesis via Learning Structural and Textural Representations." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01788

Markdown

[Xu et al. "3D-Aware Image Synthesis via Learning Structural and Textural Representations." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/xu2022cvpr-3daware/) doi:10.1109/CVPR52688.2022.01788

BibTeX

@inproceedings{xu2022cvpr-3daware,
  title     = {{3D-Aware Image Synthesis via Learning Structural and Textural Representations}},
  author    = {Xu, Yinghao and Peng, Sida and Yang, Ceyuan and Shen, Yujun and Zhou, Bolei},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {18430-18439},
  doi       = {10.1109/CVPR52688.2022.01788},
  url       = {https://mlanthology.org/cvpr/2022/xu2022cvpr-3daware/}
}