The Devil Is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection

Abstract

Low-cost monocular 3D object detection plays a fundamental role in autonomous driving, yet its accuracy is still far from satisfactory. Our objective is to dig into the 3D object detection task and reformulate it as the sub-tasks of object localization and appearance perception, which helps to exploit the reciprocal information underlying the entire task. We introduce a Dynamic Feature Reflecting Network, named DFR-Net, which contains two novel standalone modules: (i) the Appearance-Localization Feature Reflecting module (ALFR), which first separates task-specific features and then self-mutually reflects the reciprocal features; (ii) the Dynamic Intra-Trading module (DIT), which adaptively realigns the training processes of the sub-tasks in a self-learning manner. Extensive experiments on the challenging KITTI dataset demonstrate the effectiveness and generalization of DFR-Net. We rank 1st among all the monocular 3D object detectors on the KITTI test set (as of March 16th, 2021). The proposed method can also be plugged into many cutting-edge 3D detection frameworks at negligible cost to boost performance. The code will be made publicly available.

Cite

Text

Zou et al. "The Devil Is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00271

Markdown

[Zou et al. "The Devil Is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/zou2021iccv-devil/) doi:10.1109/ICCV48922.2021.00271

BibTeX

@inproceedings{zou2021iccv-devil,
  title     = {{The Devil Is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection}},
  author    = {Zou, Zhikang and Ye, Xiaoqing and Du, Liang and Cheng, Xianhui and Tan, Xiao and Zhang, Li and Feng, Jianfeng and Xue, Xiangyang and Ding, Errui},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {2713--2722},
  doi       = {10.1109/ICCV48922.2021.00271},
  url       = {https://mlanthology.org/iccv/2021/zou2021iccv-devil/}
}