SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
Abstract
LiDAR-Camera fusion-based 3D detection is a critical task for autonomous driving. In recent years, many LiDAR-Camera fusion approaches have emerged and achieved promising performance compared with single-modal detectors, but they consistently lack carefully designed and effective supervision for the fusion process. In this paper, we propose a novel training strategy called SupFusion, which provides auxiliary feature-level supervision for effective LiDAR-Camera fusion and significantly boosts detection performance. Our strategy involves a data enhancement method named Polar Sampling, which densifies sparse objects, and trains an assistant model to generate high-quality features as the supervision. These features are then used to train the LiDAR-Camera fusion model, where the fusion feature is optimized to mimic the generated high-quality features. Furthermore, we propose a simple yet effective deep fusion module, which continuously gains superior performance compared with previous fusion methods under the SupFusion strategy. Our proposal therefore has the following advantages. First, SupFusion introduces auxiliary feature-level supervision that can boost LiDAR-Camera detection performance without introducing extra inference cost. Second, the proposed deep fusion module can continuously improve the detector's abilities. The proposed SupFusion and deep fusion module are plug-and-play, and we conduct extensive experiments to demonstrate their effectiveness. Specifically, we gain around 2% 3D mAP improvement on the KITTI benchmark across multiple LiDAR-Camera 3D detectors. Our code is available at https://github.com/IranQin/SupFusion.
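The feature-level supervision described in the abstract can be illustrated with a small sketch. The abstract does not state the form of the auxiliary loss, so the mean squared error used here is an assumption, as are the function names `feature_mimic_loss` and `total_training_loss` and the weighting parameter `aux_weight`. The sketch only shows the general idea: during training, the fused feature is pulled toward the assistant model's high-quality feature, and the extra term is added to the usual detection loss.

```python
import numpy as np


def feature_mimic_loss(fusion_feat, target_feat):
    """Mean squared error between the fused feature map and the target
    feature map produced by the assistant model (assumed loss form)."""
    fusion_feat = np.asarray(fusion_feat, dtype=np.float64)
    target_feat = np.asarray(target_feat, dtype=np.float64)
    if fusion_feat.shape != target_feat.shape:
        raise ValueError("feature maps must have the same shape")
    return float(np.mean((fusion_feat - target_feat) ** 2))


def total_training_loss(detection_loss, fusion_feat, target_feat, aux_weight=1.0):
    """Detection loss plus the weighted auxiliary feature-level term.

    aux_weight is a hypothetical hyperparameter, not taken from the paper.
    """
    return detection_loss + aux_weight * feature_mimic_loss(fusion_feat, target_feat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(4, 8, 8))                       # assistant-model feature (stand-in)
    fused = target + 0.1 * rng.normal(size=target.shape)      # fused feature (stand-in)
    print(total_training_loss(1.25, fused, target, aux_weight=0.5))
```

Because the auxiliary term is computed only during training, a model trained this way would run inference exactly as the unsupervised fusion model does, which is consistent with the abstract's claim of no extra inference cost.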
Cite
Text
Qin et al. "SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.02012
Markdown
[Qin et al. "SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/qin2023iccv-supfusion/) doi:10.1109/ICCV51070.2023.02012
BibTeX
@inproceedings{qin2023iccv-supfusion,
title = {{SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection}},
author = {Qin, Yiran and Wang, Chaoqun and Kang, Zijian and Ma, Ningning and Li, Zhen and Zhang, Ruimao},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {22014-22024},
doi = {10.1109/ICCV51070.2023.02012},
url = {https://mlanthology.org/iccv/2023/qin2023iccv-supfusion/}
}