Improving Data Augmentation for Multi-Modality 3D Object Detection

Abstract

Single-modality object detectors have witnessed a drastic boost in the past few years thanks to well-explored data augmentation and training techniques. In contrast, multi-modality detectors adopt relatively simple data augmentation because of the difficulty of ensuring cross-modality consistency between point clouds and images. This limitation hampers the fusion effectiveness and performance growth of multi-modality detectors. We therefore contribute a pipeline, named transformation flow, that bridges the gap between single- and multi-modality data augmentation by reversing and replaying transformations. In addition, owing to occlusions, a point may be occupied by different objects in different modalities, which makes augmentations such as cut-and-paste non-trivial for multi-modality detection. We further present Multi-mOdality Cut and pAste (MoCa), which simultaneously considers occlusion and physical plausibility to maintain multi-modality consistency. Without using an ensemble of detectors, our multi-modality detector achieves new state-of-the-art performance on the nuScenes dataset and competitive performance on the KITTI 3D benchmark. Code and models will be released.
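To make the transformation-flow idea concrete, here is a minimal sketch under stated assumptions; the names `TransformationFlow`, `project_to_image`, and `lidar2img` are hypothetical and not the paper's API. Each point-cloud augmentation records its invertible 4x4 homogeneous transform, so augmented points can be reversed back to the original sensor frame before projection onto the image, and original-frame points can be replayed forward into the augmented frame.

```python
import numpy as np

class TransformationFlow:
    """Records the chain of 4x4 homogeneous transforms applied to a point cloud."""

    def __init__(self):
        self._mats = []  # transforms, in the order they were applied

    def record(self, mat):
        self._mats.append(np.asarray(mat, dtype=np.float64))

    def reverse(self, points_aug):
        """Map augmented (N, 3) points back to the original sensor frame."""
        pts = np.hstack([points_aug, np.ones((len(points_aug), 1))])
        for mat in reversed(self._mats):
            pts = pts @ np.linalg.inv(mat).T  # undo the last transform first
        return pts[:, :3]

    def replay(self, points_orig):
        """Re-apply the recorded transforms to (N, 3) original-frame points."""
        pts = np.hstack([points_orig, np.ones((len(points_orig), 1))])
        for mat in self._mats:
            pts = pts @ mat.T
        return pts[:, :3]


def project_to_image(flow, points_aug, lidar2img):
    """Get pixel coordinates for augmented points: reverse the point-cloud
    augmentations first, then project with the original camera calibration."""
    pts = flow.reverse(points_aug)                           # back to sensor frame
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    uvw = pts_h @ np.asarray(lidar2img, dtype=np.float64).T  # 3x4 projection
    return uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)     # (u, v) pixels
```

For example, a global rotation about the z-axis would call `flow.record(Rz)` with its 4x4 matrix before rotating the points; `reverse` then composes the inverses in reverse order. Likewise, here is a hypothetical sketch of occlusion-aware cut-and-paste in the spirit of MoCa (function and field names are assumptions): sampled objects are kept only if their 3D boxes do not collide with existing ones (physical plausibility), and their image patches are pasted from far to near so nearer objects correctly occlude farther ones.

```python
import numpy as np

def boxes_collide(a, b):
    """Crude axis-aligned BEV overlap test; box = (cx, cy, cz, dx, dy, dz)."""
    return (abs(a[0] - b[0]) < (a[3] + b[3]) / 2 and
            abs(a[1] - b[1]) < (a[4] + b[4]) / 2)


def multimodality_cut_and_paste(points, image, gt_boxes, samples):
    """points: (N, 3); image: (H, W, 3); gt_boxes: (M, 6);
    samples: dicts with 'points', 'box', 'patch', 'bbox2d' from a GT database."""
    # 1) Physical plausibility: keep a sample only if its 3D box collides
    #    with neither the scene's boxes nor previously kept samples.
    kept, occupied = [], [b for b in gt_boxes]
    for s in samples:
        if all(not boxes_collide(s['box'], b) for b in occupied):
            kept.append(s)
            occupied.append(s['box'])
    # 2) Occlusion handling: paste image patches from far to near so that
    #    closer pasted objects overwrite (occlude) farther ones.
    kept.sort(key=lambda s: np.linalg.norm(s['box'][:3]), reverse=True)
    for s in kept:
        x1, y1, x2, y2 = s['bbox2d']
        image[y1:y2, x1:x2] = s['patch']  # patch pre-cropped to the 2D box
        points = np.vstack([points, s['points']])
        gt_boxes = np.vstack([gt_boxes, s['box']])
    return points, image, gt_boxes
```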

Cite

Text

Zhang et al. "Improving Data Augmentation for Multi-Modality 3D Object Detection." ICLR 2023 Workshops: SR4AD, 2023.

Markdown

[Zhang et al. "Improving Data Augmentation for Multi-Modality 3D Object Detection." ICLR 2023 Workshops: SR4AD, 2023.](https://mlanthology.org/iclrw/2023/zhang2023iclrw-improving/)

BibTeX

@inproceedings{zhang2023iclrw-improving,
  title     = {{Improving Data Augmentation for Multi-Modality 3D Object Detection}},
  author    = {Zhang, Wenwei and Wang, Zhe and Loy, Chen Change},
  booktitle = {ICLR 2023 Workshops: SR4AD},
  year      = {2023},
  url       = {https://mlanthology.org/iclrw/2023/zhang2023iclrw-improving/}
}