HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection

Noh, Jongyoun; Lee, Sanghoon; Ham, Bumsub

doi:10.1109/CVPR46437.2021.01437

CVPR 2021 pp. 14605-14614

Abstract

We address the problem of 3D object detection, that is, estimating 3D object bounding boxes from point clouds. 3D object detection methods exploit either voxel-based or point-based features to represent 3D objects in a scene. Voxel-based features are efficient to extract, but they fail to preserve fine-grained 3D structures of objects. Point-based features, on the other hand, represent the 3D structures more accurately, but extracting them is computationally expensive. In this paper, we introduce a novel single-stage 3D detection method that has the merits of both voxel-based and point-based features. To this end, we propose a new convolutional neural network (CNN) architecture, dubbed HVPR, that integrates both features into a single 3D representation effectively and efficiently. Specifically, we augment the point-based features with a memory module to reduce the computational cost. We then aggregate the features in the memory that are semantically similar to each voxel-based one to obtain a hybrid 3D representation in the form of a pseudo image, which allows 3D objects to be localized efficiently in a single stage. We also propose an Attentive Multi-scale Feature Module (AMFM) that extracts scale-aware features considering the sparse and irregular patterns of point clouds. Experimental results on the KITTI dataset demonstrate the effectiveness and efficiency of our approach, achieving a better compromise in terms of speed and accuracy.
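
The memory aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes dot-product similarity with softmax weighting for reading the memory, concatenation for combining the voxel-based feature with the memory read, and a dense 2D grid as the pseudo image. The names `read_memory` and `build_pseudo_image`, and all toy values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(voxel_feat, memory):
    """Return a similarity-weighted average of memory items for one voxel feature.

    Assumption: similarity is a plain dot product followed by softmax.
    """
    scores = [sum(v * m for v, m in zip(voxel_feat, item)) for item in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    return [sum(w * item[d] for w, item in zip(weights, memory)) for d in range(dim)]


def build_pseudo_image(voxel_feats, coords, memory, height, width):
    """Scatter hybrid features onto a dense height x width grid.

    Assumption: the hybrid feature is the voxel feature concatenated with
    its memory read; cells without a voxel stay zero.
    """
    channels = len(voxel_feats[0]) + len(memory[0])
    image = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for feat, (row, col) in zip(voxel_feats, coords):
        image[row][col] = list(feat) + read_memory(feat, memory)
    return image


# Toy data: two memory items standing in for stored point-based features,
# and two non-empty voxels with their grid coordinates.
memory = [[1.0, 0.0], [0.0, 1.0]]
voxel_feats = [[5.0, 0.0], [0.0, 5.0]]
coords = [(0, 1), (2, 0)]
pseudo_image = build_pseudo_image(voxel_feats, coords, memory, height=3, width=2)
```

In this toy setup each voxel feature aligns strongly with one memory item, so its memory read is dominated by that item. The resulting grid is the kind of 2D input a convolutional detection head could consume in a single stage.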

Cite

Text

Noh et al. "HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01437

Markdown

[Noh et al. "HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/noh2021cvpr-hvpr/) doi:10.1109/CVPR46437.2021.01437

BibTeX

@inproceedings{noh2021cvpr-hvpr,
  title     = {{HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection}},
  author    = {Noh, Jongyoun and Lee, Sanghoon and Ham, Bumsub},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {14605-14614},
  doi       = {10.1109/CVPR46437.2021.01437},
  url       = {https://mlanthology.org/cvpr/2021/noh2021cvpr-hvpr/}
}