Enhancing Multi-Modal Features Using Local Self-Attention for 3D Object Detection

Abstract

LiDAR and camera sensors have complementary properties: LiDAR senses accurate 3D positions, while cameras provide rich texture and color information. Fusing these two modalities can intuitively improve the performance of 3D detection. Most multi-modal fusion methods use separate networks to extract features from the LiDAR and camera modalities, then simply add or concatenate them. We argue that these two kinds of signals are fundamentally different, so it is not appropriate to combine such heterogeneous features directly. In this paper, we propose EMMF-Det, which performs multi-modal fusion leveraging range and camera images. EMMF-Det uses a self-attention mechanism to re-weight the features of the two modalities interactively, enhancing them with the color, texture, and localization information provided by the camera and LiDAR signals. On the Waymo Open Dataset, EMMF-Det achieves state-of-the-art performance. In addition, evaluation on a self-built dataset further demonstrates the effectiveness of our method.
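
For intuition, below is a minimal PyTorch sketch of cross-modal feature re-weighting with local self-attention: features of one modality act as queries and are re-weighted by keys/values from the other modality within non-overlapping local windows. The class name LocalCrossModalAttention, the single-head design, the window size, and all shapes are illustrative assumptions, not the paper's actual EMMF-Det architecture.

import torch
import torch.nn as nn

class LocalCrossModalAttention(nn.Module):
    """Re-weight one modality's features with another's inside
    non-overlapping local windows (single-head, for brevity)."""
    def __init__(self, dim, window=8):
        super().__init__()
        self.window = window
        self.scale = dim ** -0.5
        self.q = nn.Linear(dim, dim)        # queries from the modality to enhance
        self.kv = nn.Linear(dim, 2 * dim)   # keys/values from the other modality

    def forward(self, x_q, x_kv):
        # x_q, x_kv: (B, N, C) tokens from spatially aligned feature maps,
        # ordered so each local window occupies `window` consecutive tokens.
        B, N, C = x_q.shape
        w = self.window
        q = self.q(x_q).view(B, N // w, w, C)
        k, v = self.kv(x_kv).view(B, N // w, w, 2, C).unbind(dim=3)
        attn = ((q @ k.transpose(-2, -1)) * self.scale).softmax(dim=-1)
        out = (attn @ v).reshape(B, N, C)
        return x_q + out  # residual: query modality enhanced by the other

# Hypothetical usage: enhance LiDAR range-image features with camera cues.
lidar_to_cam = LocalCrossModalAttention(dim=64, window=8)
lidar_feat = torch.randn(2, 1024, 64)   # flattened range-image features
cam_feat = torch.randn(2, 1024, 64)     # flattened, aligned camera features
enhanced_lidar = lidar_to_cam(lidar_feat, cam_feat)  # (2, 1024, 64)

The same module applied with the arguments swapped would re-weight camera features with LiDAR localization cues, matching the interactive, bidirectional enhancement the abstract describes.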

Cite

Text

Li et al. "Enhancing Multi-Modal Features Using Local Self-Attention for 3D Object Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-20080-9_31

Markdown

[Li et al. "Enhancing Multi-Modal Features Using Local Self-Attention for 3D Object Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/li2022eccv-enhancing/) doi:10.1007/978-3-031-20080-9_31

BibTeX

@inproceedings{li2022eccv-enhancing,
  title     = {{Enhancing Multi-Modal Features Using Local Self-Attention for 3D Object Detection}},
  author    = {Li, Hao and Zhang, Zehan and Zhao, Xian and Wang, Yulong and Shen, Yuxi and Pu, Shiliang and Mao, Hui},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-20080-9_31},
  url       = {https://mlanthology.org/eccv/2022/li2022eccv-enhancing/}
}