Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection

Abstract

Current 3D object detection models follow a single-dataset training and testing paradigm, and they often suffer a serious drop in detection accuracy when deployed directly on another dataset. In this paper, we study the task of training a unified 3D detector from multiple datasets. We observe that this is a challenging task, mainly because these datasets present substantial data-level differences and taxonomy-level variations caused by different LiDAR types and data acquisition standards. Motivated by this observation, we present Uni3D, which leverages a simple data-level correction operation and a dedicated semantic-level coupling-and-recoupling module to alleviate the unavoidable data-level and taxonomy-level differences, respectively. Our method is simple and easily combined with many 3D object detection baselines such as PV-RCNN and Voxel-RCNN, enabling them to learn effectively from multiple off-the-shelf 3D datasets and obtain more discriminative and generalizable representations. Experiments are conducted on many dataset consolidation settings. The results demonstrate that Uni3D exceeds a series of individual detectors trained on a single dataset, with a 1.04x parameter increase over a selected baseline detector. We expect this work to inspire research on 3D generalization, since it pushes the limits of perceptual performance. Our code is available at: https://github.com/PJLab-ADG/3DTrans

Cite

Text

Zhang et al. "Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00893

Markdown

[Zhang et al. "Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/zhang2023cvpr-uni3d/) doi:10.1109/CVPR52729.2023.00893

BibTeX

@inproceedings{zhang2023cvpr-uni3d,
  title     = {{Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection}},
  author    = {Zhang, Bo and Yuan, Jiakang and Shi, Botian and Chen, Tao and Li, Yikang and Qiao, Yu},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {9253--9262},
  doi       = {10.1109/CVPR52729.2023.00893},
  url       = {https://mlanthology.org/cvpr/2023/zhang2023cvpr-uni3d/}
}