NeRF-RPN: A General Framework for Object Detection in NeRFs

Abstract

This paper presents NeRF-RPN, the first significant object detection framework that operates directly on NeRF. Given a pre-trained NeRF model, NeRF-RPN aims to detect all bounding boxes of objects in a scene. By exploiting a novel voxel representation that incorporates multi-scale 3D neural volumetric features, we demonstrate that it is possible to regress the 3D bounding boxes of objects in NeRF directly, without rendering the NeRF from any viewpoint. NeRF-RPN is a general framework and can be applied to detect objects without class labels. We experimented with NeRF-RPN using various backbone architectures, RPN head designs, and loss functions, all of which can be trained end-to-end to estimate high-quality 3D bounding boxes. To facilitate future research in object detection for NeRF, we built a new benchmark dataset consisting of both synthetic and real-world data with careful labeling and cleanup. Code and dataset are available at https://github.com/lyclyc52/NeRF_RPN.
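
To make the idea concrete, below is a minimal sketch (not the authors' implementation) of the pipeline the abstract describes: features sampled on a regular voxel grid from a trained NeRF are passed through a small 3D convolutional backbone, and an RPN-style head predicts per-voxel objectness scores and box offsets without rendering any views. The class name `TinyNeRFRPN`, the channel counts, and the 6-parameter box encoding are illustrative assumptions.

```python
import torch
import torch.nn as nn


class TinyNeRFRPN(nn.Module):
    """Minimal 3D RPN sketch: a small convolutional backbone over a NeRF
    feature grid, followed by per-voxel objectness and box-regression heads.
    Layer sizes and the box parameterization are illustrative only."""

    def __init__(self, in_channels=4, feat_channels=64):
        super().__init__()
        # 3D backbone: a stand-in for the multi-scale volumetric
        # features described in the paper.
        self.stem = nn.Sequential(
            nn.Conv3d(in_channels, feat_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(feat_channels, feat_channels, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # RPN head: objectness score and 6 box offsets per voxel.
        self.objectness = nn.Conv3d(feat_channels, 1, 1)
        self.box_reg = nn.Conv3d(feat_channels, 6, 1)  # (dx, dy, dz, dw, dh, dd)

    def forward(self, voxel_grid):
        # voxel_grid: (B, C, D, H, W) densities/colors sampled from a trained NeRF.
        feats = self.stem(voxel_grid)
        return self.objectness(feats), self.box_reg(feats)


if __name__ == "__main__":
    # Stand-in for features queried on a regular grid from a pre-trained NeRF
    # (e.g., density + RGB -> 4 channels); real usage would sample the NeRF here.
    grid = torch.rand(1, 4, 64, 64, 64)
    rpn = TinyNeRFRPN(in_channels=4)
    scores, deltas = rpn(grid)
    print(scores.shape, deltas.shape)  # (1, 1, 32, 32, 32), (1, 6, 32, 32, 32)
```

In the actual method the proposals would be decoded into 3D boxes and filtered (e.g., by non-maximum suppression); the sketch stops at the raw head outputs.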

Cite

Text

Hu et al. "NeRF-RPN: A General Framework for Object Detection in NeRFs." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02253

Markdown

[Hu et al. "NeRF-RPN: A General Framework for Object Detection in NeRFs." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/hu2023cvpr-nerfrpn/) doi:10.1109/CVPR52729.2023.02253

BibTeX

@inproceedings{hu2023cvpr-nerfrpn,
  title     = {{NeRF-RPN: A General Framework for Object Detection in NeRFs}},
  author    = {Hu, Benran and Huang, Junkai and Liu, Yichen and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {23528--23538},
  doi       = {10.1109/CVPR52729.2023.02253},
  url       = {https://mlanthology.org/cvpr/2023/hu2023cvpr-nerfrpn/}
}