FaceCLIPNeRF: Text-Driven 3D Face Manipulation Using Deformable Neural Radiance Fields

Abstract

As recent advances in Neural Radiance Fields (NeRF) have enabled high-fidelity 3D face reconstruction and novel view synthesis, manipulating the reconstructed faces has also become an essential task in 3D vision. However, existing manipulation methods require extensive human labor, such as a user-provided semantic mask and manual attribute search, which makes them unsuitable for non-expert users. Instead, our approach is designed to require only a single text prompt to manipulate a face reconstructed with NeRF. To do so, we first train a scene manipulator, a latent code-conditional deformable NeRF, over a dynamic scene to control face deformation using the latent code. However, representing a scene deformation with a single latent code is unfavorable for compositing local deformations observed in different instances. Therefore, our proposed Position-conditional Anchor Compositor (PAC) learns to represent a manipulated scene with spatially varying latent codes. Renderings of these codes with the scene manipulator are then optimized to yield high cosine similarity to a target text in CLIP embedding space for text-driven manipulation. To the best of our knowledge, our approach is the first to address the text-driven manipulation of a face reconstructed with NeRF. Extensive results, comparisons, and ablation studies demonstrate the effectiveness of our approach.
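
The two ideas named in the abstract, spatially varying latent codes and a cosine-similarity objective in an embedding space, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how PAC computes its per-position codes, so the softmax blending over a small set of anchor codes is an assumption, and `compose_latent_codes`, `weight_fn`, and `clip_style_loss` are hypothetical names. The embeddings are plain vectors standing in for CLIP image and text features.

```python
import numpy as np

def compose_latent_codes(points, anchors, weight_fn):
    """Blend K anchor codes into one latent code per 3D point.

    points:    (N, 3) sample positions
    anchors:   (K, D) anchor latent codes (assumed to be given)
    weight_fn: maps (N, 3) positions to (N, K) unnormalized scores
               (a stand-in for a learned, position-conditional network)
    """
    logits = weight_fn(points)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)        # softmax over anchors
    return weights @ anchors                             # (N, D)

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def clip_style_loss(image_embedding, text_embedding):
    """Loss that decreases as the two embeddings align (assumed form)."""
    return 1.0 - cosine_similarity(image_embedding, text_embedding)

# Toy data: 5 points, 4 anchors with 8-dimensional codes.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))
anchors = rng.normal(size=(4, 8))
proj = rng.normal(size=(3, 4))  # hypothetical linear scoring of positions

codes = compose_latent_codes(points, anchors, lambda p: p @ proj)
text_emb = rng.normal(size=8)
loss_same = clip_style_loss(text_emb, text_emb)
loss_opposite = clip_style_loss(text_emb, -text_emb)
```

Because each point receives its own convex combination of anchors, different regions of the face can follow different deformations, which a single global latent code cannot express. In a real pipeline the loss would be backpropagated through a renderer and an image encoder; that part is omitted here.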

Cite

Text

Hwang et al. "FaceCLIPNeRF: Text-Driven 3D Face Manipulation Using Deformable Neural Radiance Fields." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00321

Markdown

[Hwang et al. "FaceCLIPNeRF: Text-Driven 3D Face Manipulation Using Deformable Neural Radiance Fields." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/hwang2023iccv-faceclipnerf/) doi:10.1109/ICCV51070.2023.00321

BibTeX

@inproceedings{hwang2023iccv-faceclipnerf,
  title     = {{FaceCLIPNeRF: Text-Driven 3D Face Manipulation Using Deformable Neural Radiance Fields}},
  author    = {Hwang, Sungwon and Hyung, Junha and Kim, Daejin and Kim, Min-Jung and Choo, Jaegul},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {3469-3479},
  doi       = {10.1109/ICCV51070.2023.00321},
  url       = {https://mlanthology.org/iccv/2023/hwang2023iccv-faceclipnerf/}
}