DSGN: Deep Stereo Geometry Network for 3D Object Detection

Abstract

Most state-of-the-art 3D object detectors rely heavily on LiDAR sensors, and a large performance gap remains between image-based and LiDAR-based methods, caused by inappropriate representations for prediction in 3D scenarios. Our method, called Deep Stereo Geometry Network (DSGN), reduces this gap significantly by detecting 3D objects on a differentiable volumetric representation, the 3D geometric volume, which effectively encodes 3D geometric structure in regular 3D space. With this representation, we learn depth information and semantic cues simultaneously. For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline that jointly estimates depth and detects 3D objects in an end-to-end learning manner. Our approach outperforms previous stereo-based 3D detectors (about 10 higher in AP) and even achieves performance comparable to a few LiDAR-based methods on the KITTI 3D object detection leaderboard. Code will be made publicly available at https://github.com/chenyilun95/DSGN.

Cite

Text

Chen et al. "DSGN: Deep Stereo Geometry Network for 3D Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.01255

Markdown

[Chen et al. "DSGN: Deep Stereo Geometry Network for 3D Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/chen2020cvpr-dsgn/) doi:10.1109/CVPR42600.2020.01255

BibTeX

@inproceedings{chen2020cvpr-dsgn,
  title     = {{DSGN: Deep Stereo Geometry Network for 3D Object Detection}},
  author    = {Chen, Yilun and Liu, Shu and Shen, Xiaoyong and Jia, Jiaya},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.01255},
  url       = {https://mlanthology.org/cvpr/2020/chen2020cvpr-dsgn/}
}