4D-Net for Learned Multi-Modal Alignment

Abstract

We present 4D-Net, a 3D object detection approach which utilizes 3D point cloud and RGB sensing information, both in time. We incorporate the 4D information by performing novel dynamic connection learning across feature representations and levels of abstraction, and by observing geometric constraints. Our approach outperforms the state of the art and strong baselines on the Waymo Open Dataset. 4D-Net is better able to use motion cues and dense image information to detect distant objects. We will open-source the code.
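For illustration, here is a minimal sketch, not the authors' implementation, of the dynamic connection learning idea described above, assuming PyTorch: each point cloud feature level learns soft weights over several candidate RGB feature levels and fuses the weighted image context into its own features. All module and variable names below are hypothetical.

    # A minimal sketch of "dynamic connection learning": one point cloud
    # feature level learns soft weights over several RGB feature levels and
    # fuses the weighted image context into its own features. Hypothetical
    # names; not the 4D-Net code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicConnection(nn.Module):
        """Learns soft weights over candidate RGB feature levels for one
        point cloud feature level, then fuses the weighted sum in."""

        def __init__(self, pc_channels: int, rgb_channels: int, num_rgb_levels: int):
            super().__init__()
            # One learnable logit per candidate connection (one per RGB level).
            self.connection_logits = nn.Parameter(torch.zeros(num_rgb_levels))
            self.project = nn.Linear(rgb_channels, pc_channels)

        def forward(self, pc_feats, rgb_feats_per_level):
            # pc_feats: (N, pc_channels) features for N points (or pillars).
            # rgb_feats_per_level: list of (N, rgb_channels) image features,
            # assumed already sampled at each point's projected pixel location
            # (the geometric constraint: points are projected into the image).
            weights = F.softmax(self.connection_logits, dim=0)  # (L,)
            fused_rgb = sum(w * f for w, f in zip(weights, rgb_feats_per_level))
            return pc_feats + self.project(fused_rgb)  # residual fusion

    # Usage with random stand-in features:
    module = DynamicConnection(pc_channels=64, rgb_channels=128, num_rgb_levels=3)
    pc = torch.randn(1000, 64)
    rgb = [torch.randn(1000, 128) for _ in range(3)]
    out = module(pc, rgb)  # (1000, 64)

The residual form is one plausible design choice here: it keeps the point cloud branch informative on its own even when the learned connection weights down-weight the image features.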

Cite

Text

Piergiovanni et al. "4D-Net for Learned Multi-Modal Alignment." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.01515

Markdown

[Piergiovanni et al. "4D-Net for Learned Multi-Modal Alignment." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/piergiovanni2021iccv-4dnet/) doi:10.1109/ICCV48922.2021.01515

BibTeX

@inproceedings{piergiovanni2021iccv-4dnet,
  title     = {{4D-Net for Learned Multi-Modal Alignment}},
  author    = {Piergiovanni, Aj and Casser, Vincent and Ryoo, Michael S. and Angelova, Anelia},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {15435--15445},
  doi       = {10.1109/ICCV48922.2021.01515},
  url       = {https://mlanthology.org/iccv/2021/piergiovanni2021iccv-4dnet/}
}