3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement

Abstract

Despite advances in neural rendering, view synthesis and 3D model generation remain restricted to low resolutions with suboptimal multi-view consistency, owing to the scarcity of high-quality 3D datasets and the inherent limitations of multi-view diffusion models. In this study, we present a novel 3D enhancement pipeline, dubbed 3DEnhancer, which employs a multi-view latent diffusion model to enhance coarse 3D inputs while preserving multi-view consistency. Our method combines a pose-aware encoder and a diffusion-based denoiser to refine low-quality multi-view images, together with data augmentation and a multi-view attention module with epipolar aggregation, to produce consistent, high-quality 3D outputs across views. Unlike existing video-based approaches, our model supports seamless multi-view enhancement with improved coherence across diverse viewing angles. Extensive evaluations show that 3DEnhancer significantly outperforms existing methods on both multi-view enhancement and per-instance 3D optimization tasks.

Cite

Text

Luo et al. "3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01532

Markdown

[Luo et al. "3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/luo2025cvpr-3denhancer/) doi:10.1109/CVPR52734.2025.01532

BibTeX

@inproceedings{luo2025cvpr-3denhancer,
  title     = {{3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement}},
  author    = {Luo, Yihang and Zhou, Shangchen and Lan, Yushi and Pan, Xingang and Loy, Chen Change},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {16430--16440},
  doi       = {10.1109/CVPR52734.2025.01532},
  url       = {https://mlanthology.org/cvpr/2025/luo2025cvpr-3denhancer/}
}