InstancePose: Fast 6DoF Pose Estimation for Multiple Objects from a Single RGB Image
Abstract
6DoF object pose estimation depends on positional accuracy, implementation complexity and processing speed. This study presents a fast and accurate method for estimating the 6DoF poses of multiple object instances. The proposed method uses a deep neural network that outputs 4 types of feature maps: the error object mask, semantic object masks, center vector maps (CVM) and 6D coordinate maps. These feature maps are combined in post-processing to detect and estimate multi-object 2D-3D correspondences in parallel for PnP RANSAC estimation. The experiments show that the method can process input RGB images containing 7 different object categories/instances at a speed of 25 frames per second, with accuracy competitive with current state-of-the-art methods, which focus only on some specific conditions.
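The abstract does not spell out how the feature maps are combined, so the following is only a minimal sketch of one plausible post-processing step, not the authors' implementation. It assumes that each foreground pixel carries a class label and a 2D offset pointing to its object's center, and it groups pixels into instances by the center they vote for. The function name `group_instances`, the array layouts and the quantization step `cell` are all hypothetical.

```python
import numpy as np


def group_instances(class_mask, center_vectors, cell=4):
    """Group foreground pixels into instances by the center each pixel votes for.

    Assumed (hypothetical) layouts:
      class_mask:     (H, W) int array, 0 = background, k > 0 = object class k.
      center_vectors: (H, W, 2) float array of (dx, dy) offsets from a pixel
                      to the center of the object it belongs to.
      cell:           quantization step in pixels used to merge nearby votes.

    Returns a dict mapping (class_id, x_bin, y_bin) to an (N, 2) array of
    (x, y) pixel coordinates belonging to that instance.
    """
    # Coordinates of all foreground pixels.
    ys, xs = np.nonzero(class_mask)

    # Each pixel votes for a center: its own position plus its predicted offset.
    votes = np.stack([xs, ys], axis=1) + center_vectors[ys, xs]

    # Quantize votes so that pixels voting for nearly the same center share a bin.
    bins = np.floor(votes / cell).astype(int)
    classes = class_mask[ys, xs]

    instances = {}
    for x, y, c, (bx, by) in zip(xs, ys, classes, bins):
        key = (int(c), int(bx), int(by))
        instances.setdefault(key, []).append((int(x), int(y)))

    return {key: np.array(pixels) for key, pixels in instances.items()}
```

Under these assumptions, the pixels of each returned instance could be paired with the 3D coordinates predicted at the same locations to form 2D-3D correspondences for a PnP RANSAC solver. Simple grid quantization can split an instance whose votes straddle a bin boundary, so a real pipeline would likely need a more robust clustering step.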
Cite
Text
Aing et al. "InstancePose: Fast 6DoF Pose Estimation for Multiple Objects from a Single RGB Image." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00296
Markdown
[Aing et al. "InstancePose: Fast 6DoF Pose Estimation for Multiple Objects from a Single RGB Image." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/aing2021iccvw-instancepose/) doi:10.1109/ICCVW54120.2021.00296
BibTeX
@inproceedings{aing2021iccvw-instancepose,
title = {{InstancePose: Fast 6DoF Pose Estimation for Multiple Objects from a Single RGB Image}},
author = {Aing, Lee and Lie, Wen-Nung and Chiang, Jui-Chiu and Lin, Guo-Shiang},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2021},
pages = {2621-2630},
doi = {10.1109/ICCVW54120.2021.00296},
url = {https://mlanthology.org/iccvw/2021/aing2021iccvw-instancepose/}
}