SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
Abstract
Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (i.e., the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem. While end-to-end methods have recently demonstrated promising results at high efficiency, they are still inferior when compared with elaborate PnP/RANSAC-based approaches in terms of pose accuracy. In this work, we address this shortcoming by means of a novel reasoning about self-occlusion, in order to establish a two-layer representation for 3D objects which considerably enhances the accuracy of end-to-end 6D pose estimation. Our framework, named SO-Pose, takes a single RGB image as input and respectively generates 2D-3D correspondences as well as self-occlusion information harnessing a shared encoder and two separate decoders. Both outputs are then fused to directly regress the 6DoF pose parameters. Incorporating cross-layer consistencies that align correspondences, self-occlusion, and 6D pose, we can further improve accuracy and robustness, surpassing or rivaling all other state-of-the-art approaches on various challenging datasets.
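The pipeline sketched in the abstract — a shared encoder, two decoders (one for 2D-3D correspondences, one for self-occlusion), and a fusion head that directly regresses the 6DoF pose — can be illustrated with a minimal toy forward pass. All layer sizes, names, and the use of plain random linear maps below are illustrative assumptions; the actual SO-Pose model is a convolutional encoder-decoder network with learned weights.

```python
import numpy as np

# Toy sketch of the SO-Pose data flow described in the abstract.
# Dimensions and linear maps are illustrative assumptions, not the
# paper's architecture (which uses CNN encoders/decoders).

rng = np.random.default_rng(0)

def linear(x, in_dim, out_dim):
    """A random affine map standing in for a learned layer."""
    w = rng.standard_normal((in_dim, out_dim)) * 0.1
    return x @ w

def so_pose_forward(image_feats):
    # Shared encoder: compress the input into one latent feature.
    feat = np.tanh(linear(image_feats, 128, 64))
    # Decoder 1: dense 2D-3D correspondences (flattened here).
    corr = linear(feat, 64, 32)
    # Decoder 2: self-occlusion information (flattened here).
    occ = linear(feat, 64, 32)
    # Fuse both outputs and directly regress the 6DoF pose:
    # 3 rotation parameters + 3 translation parameters.
    fused = np.concatenate([corr, occ], axis=-1)
    pose = linear(fused, 64, 6)
    return pose

x = rng.standard_normal(128)   # stand-in for an encoded RGB image
pose = so_pose_forward(x)
print(pose.shape)              # (6,)
```

The cross-layer consistency terms mentioned in the abstract would act as additional losses tying `corr`, `occ`, and `pose` together during training; they are omitted from this data-flow sketch.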
Cite
Text
Di et al. "SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.01217

Markdown

[Di et al. "SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/di2021iccv-sopose/) doi:10.1109/ICCV48922.2021.01217

BibTeX
@inproceedings{di2021iccv-sopose,
title = {{SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation}},
author = {Di, Yan and Manhardt, Fabian and Wang, Gu and Ji, Xiangyang and Navab, Nassir and Tombari, Federico},
booktitle = {International Conference on Computer Vision},
year = {2021},
pages = {12396--12405},
doi = {10.1109/ICCV48922.2021.01217},
url = {https://mlanthology.org/iccv/2021/di2021iccv-sopose/}
}