GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis

Abstract

Neural Radiance Fields (NeRF) have transformed novel view synthesis by modeling scene-specific volumetric representations directly from images. Generalizable NeRF models can render novel views of unseen scenes by learning latent ray representations, but their performance depends heavily on a large number of multi-view observations, and rendering quality degrades significantly when input views are limited. To address this limitation, we propose GoLF-NRT: a Global and Local feature Fusion-based Neural Rendering Transformer. GoLF-NRT enhances generalizable neural rendering from few input views by using a 3D transformer with efficient sparse attention to capture global scene context. In parallel, it integrates local geometric features extracted along the epipolar line, enabling high-quality scene reconstruction from as few as 1 to 3 input views. Furthermore, we introduce an adaptive sampling strategy based on attention weights and kernel regression, which improves the accuracy of transformer-based neural rendering. Extensive experiments on public datasets show that GoLF-NRT achieves state-of-the-art performance across varying numbers of input views. Code is available at https://github.com/KLMAV-CUC/GoLF-NRT.
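
As a rough illustration of what an adaptive sampling strategy "based on attention weights and kernel regression" could look like, the sketch below smooths per-sample attention weights along a single ray with Nadaraya-Watson kernel regression and then redraws depths by inverse-transform sampling. This is not the authors' implementation: the function names (`kernel_smooth`, `resample_depths`), the Gaussian kernel, the bandwidth, and the toy weights are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import math
import random
from bisect import bisect_right


def kernel_smooth(depths, weights, query_depths, bandwidth):
    """Nadaraya-Watson regression of `weights` over `depths` (Gaussian kernel).

    Hypothetical helper: the kernel and bandwidth are assumed, not from the paper.
    """
    smoothed = []
    for q in query_depths:
        ks = [math.exp(-0.5 * ((q - d) / bandwidth) ** 2) for d in depths]
        smoothed.append(sum(k * w for k, w in zip(ks, weights)) / sum(ks))
    return smoothed


def resample_depths(grid, density, n_samples, rng):
    """Inverse-transform sampling of depths from a density given on `grid`.

    Each bin gets its mass from the trapezoid rule and is treated as
    uniform inside the bin.
    """
    bin_mass = [
        0.5 * (density[i] + density[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    ]
    total = sum(bin_mass)
    cdf = [0.0]
    for m in bin_mass:
        cdf.append(cdf[-1] + m / total)

    samples = []
    for _ in range(n_samples):
        u = rng.random()
        # Locate the bin whose CDF interval contains u (clamped for round-off).
        i = min(bisect_right(cdf, u) - 1, len(bin_mass) - 1)
        span = cdf[i + 1] - cdf[i]
        t = min(1.0, (u - cdf[i]) / span) if span > 0 else 0.5
        samples.append(grid[i] + t * (grid[i + 1] - grid[i]))
    return sorted(samples)


# Toy example: coarse samples on one ray, with attention peaked near depth 4.
coarse_depths = [2.0, 3.0, 4.0, 5.0, 6.0]
attention = [0.05, 0.10, 0.70, 0.10, 0.05]  # assumed, already normalized

grid = [2.0 + 4.0 * i / 64 for i in range(65)]
density = kernel_smooth(coarse_depths, attention, grid, bandwidth=0.5)
fine_depths = resample_depths(grid, density, n_samples=32, rng=random.Random(0))
```

In this toy setup the redrawn depths concentrate around the depth where attention is highest, which is the general intent of importance-style resampling; how GoLF-NRT actually derives and uses the weights is described in the paper and the linked repository.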

Cite

Text

Wang et al. "GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01989

Markdown

[Wang et al. "GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/wang2025cvpr-golfnrt/) doi:10.1109/CVPR52734.2025.01989

BibTeX

@inproceedings{wang2025cvpr-golfnrt,
  title     = {{GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis}},
  author    = {Wang, You and Fang, Li and Zhu, Hao and Hu, Fei and Ye, Long and Ma, Zhan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {21349-21359},
  doi       = {10.1109/CVPR52734.2025.01989},
  url       = {https://mlanthology.org/cvpr/2025/wang2025cvpr-golfnrt/}
}