4D-Net for Learned Multi-Modal Alignment

Abstract

We present 4D-Net, a 3D object detection approach which utilizes 3D point cloud and RGB sensing information, both in time. We incorporate the 4D information by performing novel dynamic connection learning across feature representations and levels of abstraction, and by observing geometric constraints. Our approach outperforms the state of the art and strong baselines on the Waymo Open Dataset. 4D-Net is better able to use motion cues and dense image information to detect distant objects. We will open-source the code.
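For illustration, here is a minimal sketch, not the authors' implementation, of the dynamic connection learning idea described above, assuming PyTorch: each point cloud feature level learns soft weights over several candidate RGB feature levels and fuses the weighted image context into its own features. All module and variable names below are hypothetical.

    # A minimal sketch of "dynamic connection learning": one point cloud
    # feature level learns soft weights over several RGB feature levels and
    # fuses the weighted image context into its own features. Hypothetical
    # names; not the 4D-Net code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicConnection(nn.Module):
        """Learns soft weights over candidate RGB feature levels for one
        point cloud feature level, then fuses the weighted sum in."""

        def __init__(self, pc_channels: int, rgb_channels: int, num_rgb_levels: int):
            super().__init__()
            # One learnable logit per candidate connection (one per RGB level).
            self.connection_logits = nn.Parameter(torch.zeros(num_rgb_levels))
            self.project = nn.Linear(rgb_channels, pc_channels)

        def forward(self, pc_feats, rgb_feats_per_level):
            # pc_feats: (N, pc_channels) features for N points (or pillars).
            # rgb_feats_per_level: list of (N, rgb_channels) image features,
            # assumed already sampled at each point's projected pixel location
            # (the geometric constraint: points are projected into the image).
            weights = F.softmax(self.connection_logits, dim=0)  # (L,)
            fused_rgb = sum(w * f for w, f in zip(weights, rgb_feats_per_level))
            return pc_feats + self.project(fused_rgb)  # residual fusion

    # Usage with random stand-in features:
    module = DynamicConnection(pc_channels=64, rgb_channels=128, num_rgb_levels=3)
    pc = torch.randn(1000, 64)
    rgb = [torch.randn(1000, 128) for _ in range(3)]
    out = module(pc, rgb)  # (1000, 64)

The residual form is one plausible design choice here: it keeps the point cloud branch informative on its own even when the learned connection weights down-weight the image features.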

Cite

Text

Piergiovanni et al. "4D-Net for Learned Multi-Modal Alignment." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.01515

Markdown

[Piergiovanni et al. "4D-Net for Learned Multi-Modal Alignment." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/piergiovanni2021iccv-4dnet/) doi:10.1109/ICCV48922.2021.01515

BibTeX

@inproceedings{piergiovanni2021iccv-4dnet,
  title     = {{4D-Net for Learned Multi-Modal Alignment}},
  author    = {Piergiovanni, Aj and Casser, Vincent and Ryoo, Michael S. and Angelova, Anelia},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {15435--15445},
  doi       = {10.1109/ICCV48922.2021.01515},
  url       = {https://mlanthology.org/iccv/2021/piergiovanni2021iccv-4dnet/}
}