Decoupled Pseudo-Labeling for Semi-Supervised Monocular 3D Object Detection
Abstract
We delve into pseudo-labeling for semi-supervised monocular 3D object detection (SSM3OD) and discover two primary issues: a misalignment between the prediction quality of 3D and 2D attributes, and the tendency of depth supervision derived from pseudo-labels to be noisy, leading to significant optimization conflicts with other, more reliable forms of supervision. To tackle these issues, we introduce a novel decoupled pseudo-labeling (DPL) approach for SSM3OD. Our approach features a Decoupled Pseudo-label Generation (DPG) module designed to efficiently generate pseudo-labels by processing 2D and 3D attributes separately. This module incorporates a homography-based method for identifying dependable pseudo-labels in Bird's Eye View (BEV) space, specifically for 3D attributes. Additionally, we present a Depth Gradient Projection (DGP) module that mitigates the optimization conflicts caused by noisy depth supervision from pseudo-labels by decoupling the depth gradient and removing its conflicting components. This dual decoupling strategy, applied at both the pseudo-label generation and gradient levels, significantly improves the utilization of pseudo-labels in SSM3OD. Our comprehensive experiments on the KITTI benchmark demonstrate the superiority of our method over existing approaches.
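The abstract does not spell out the projection rule used by DGP, so the following is only a minimal sketch of one common way to remove a conflicting gradient component. It assumes a PCGrad-style rule: when the depth gradient has a negative inner product with a reference gradient from the more reliable supervision, the component along that reference is subtracted. The names `project_depth_gradient`, `g_depth`, and `g_ref` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_depth_gradient(
    g_depth: Sequence[float],
    g_ref: Sequence[float],
    eps: float = 1e-12,
) -> List[float]:
    """Drop the part of g_depth that opposes g_ref (assumed PCGrad-style rule).

    g_depth: flattened gradient of the pseudo-label depth loss (hypothetical).
    g_ref:   flattened gradient of the more reliable losses (hypothetical).
    """
    inner = dot(g_depth, g_ref)
    if inner >= 0.0:
        # No conflict: the depth gradient is left untouched.
        return list(g_depth)
    # Conflict: subtract the projection of g_depth onto g_ref, so the
    # result is (numerically) orthogonal to g_ref.
    scale = inner / (dot(g_ref, g_ref) + eps)
    return [gd - scale * gr for gd, gr in zip(g_depth, g_ref)]


# Conflicting case: the second coordinate opposes the reference direction.
projected = project_depth_gradient([1.0, -2.0], [0.0, 1.0])
# Non-conflicting case: the gradient is returned unchanged.
unchanged = project_depth_gradient([1.0, 1.0], [1.0, 0.0])
```

In this sketch the projected gradient no longer pushes against the reference direction, while the non-conflicting component of the depth signal is preserved. The authors' actual DGP module may define the reference gradient and the projection differently.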
Cite
Text
Zhang et al. "Decoupled Pseudo-Labeling for Semi-Supervised Monocular 3D Object Detection." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01601
Markdown
[Zhang et al. "Decoupled Pseudo-Labeling for Semi-Supervised Monocular 3D Object Detection." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/zhang2024cvpr-decoupled/) doi:10.1109/CVPR52733.2024.01601
BibTeX
@inproceedings{zhang2024cvpr-decoupled,
title = {{Decoupled Pseudo-Labeling for Semi-Supervised Monocular 3D Object Detection}},
author = {Zhang, Jiacheng and Li, Jiaming and Lin, Xiangru and Zhang, Wei and Tan, Xiao and Han, Junyu and Ding, Errui and Wang, Jingdong and Li, Guanbin},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {16923-16932},
doi = {10.1109/CVPR52733.2024.01601},
url = {https://mlanthology.org/cvpr/2024/zhang2024cvpr-decoupled/}
}