SurfConv: Bridging 3D and 2D Convolution for RGBD Images

Abstract

The last few years have seen approaches that try to combine the increasing popularity of depth sensors with the success of convolutional neural networks. Using depth as an additional channel alongside the RGB input suffers from the scale variance problem present in image-convolution-based approaches. On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, since the occupied part consists of only the surface visible to the sensor. Instead, we propose SurfConv, which “slides” compact 2D filters along the visible 3D surface. SurfConv is formulated as a simple depth-aware multi-scale 2D convolution, through a new Data-Driven Depth Discretization (D4) scheme. We demonstrate the effectiveness of our method on indoor and outdoor 3D semantic segmentation datasets. Our method achieves state-of-the-art performance while using less than 30% of the parameters used by 3D-convolution-based approaches.
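To make the idea of a depth-aware multi-scale 2D convolution more concrete, the sketch below shows one way a depth map could be split into discrete levels, each with its own image scale. It is only an illustration under stated assumptions, not the paper's D4 scheme: the quantile-based boundaries, the convention that a depth of 0 marks a missing reading, the scale rule, and the function names `discretize_depth` and `level_scales` are all hypothetical choices made here.

```python
import numpy as np


def discretize_depth(depth, n_levels=3):
    """Assign each pixel to a depth level with roughly equal pixel counts.

    Assumption: quantile boundaries stand in for a data-driven
    discretization; the paper's actual D4 scheme may differ.
    Assumption: depth == 0 marks a missing sensor reading.
    """
    valid = depth > 0
    # Interior quantiles of the observed depths act as level boundaries.
    qs = np.linspace(0.0, 1.0, n_levels + 1)[1:-1]
    bounds = np.quantile(depth[valid], qs)
    # np.digitize maps each depth to the index of the bin it falls in.
    levels = np.digitize(depth, bounds)
    levels[~valid] = -1  # invalid pixels belong to no level
    return levels, bounds


def level_scales(depth, levels, n_levels):
    """Resize factor per level, relative to the farthest level.

    Assumption: nearer surfaces appear larger in the image, so nearer
    levels are shrunk more to give a fixed 2D filter a similar
    real-world footprint on every level.
    """
    means = np.array([depth[levels == l].mean() for l in range(n_levels)])
    return means / means[-1]


depth = np.array([[0.0, 1.0, 2.0],
                  [3.0, 4.0, 5.0],
                  [6.0, 7.0, 8.0]])
levels, bounds = discretize_depth(depth, n_levels=3)
scales = level_scales(depth, levels, n_levels=3)
```

In a full pipeline, each level's masked image would be resized by its factor and passed through the same 2D network, with the per-level predictions merged back into the original image grid. Those steps are omitted here because the abstract does not specify them.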

Cite

Text

Chu et al. "SurfConv: Bridging 3D and 2D Convolution for RGBD Images." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00317

Markdown

[Chu et al. "SurfConv: Bridging 3D and 2D Convolution for RGBD Images." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/chu2018cvpr-surfconv/) doi:10.1109/CVPR.2018.00317

BibTeX

@inproceedings{chu2018cvpr-surfconv,
  title     = {{SurfConv: Bridging 3D and 2D Convolution for RGBD Images}},
  author    = {Chu, Hang and Ma, Wei-Chiu and Kundu, Kaustav and Urtasun, Raquel and Fidler, Sanja},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00317},
  url       = {https://mlanthology.org/cvpr/2018/chu2018cvpr-surfconv/}
}