LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation
Abstract
While RGBD-based methods for category-level object pose estimation hold promise, their reliance on depth data limits their applicability in diverse scenarios. In response, recent efforts have turned to RGB-based methods; however, they face significant challenges stemming from the absence of depth information. On the one hand, the lack of depth makes intra-class shape variation harder to handle, resulting in increased uncertainty in shape predictions. On the other hand, RGB-only inputs introduce inherent scale ambiguity, rendering the estimation of object size and translation an ill-posed problem. To tackle these challenges, we propose LaPose, a novel framework that models the object shape as a Laplacian mixture model for Pose estimation. By representing each point as a probabilistic distribution, we explicitly quantify the shape uncertainty. LaPose leverages both a generalized 3D information stream and a specialized feature stream to independently predict the Laplacian distribution for each point, capturing different aspects of object geometry. These two distributions are then integrated as a Laplacian mixture model to establish the 2D-3D correspondences, which are used to solve for the pose via the PnP module. To mitigate scale ambiguity, we introduce a scale-agnostic representation for object size and translation, enhancing training efficiency and overall robustness. Extensive experiments on the NOCS datasets validate the effectiveness of LaPose, yielding state-of-the-art performance in RGB-based category-level object pose estimation. Code is released at https://github.com/lolrudy/LaPose.
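As a rough illustration of what combining two per-point Laplacian distributions into a mixture can look like, the following is a minimal sketch, not the released LaPose code. The function names, the array shapes, and the fixed mixture weights `w` are all assumptions made for the example; the abstract does not specify how the two streams are weighted or which loss is used.

```python
import numpy as np

def laplace_pdf(x, mu, b):
    # Density of a Laplace distribution with location mu and scale b > 0.
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)

def laplacian_mixture_nll(x, mu, b, w):
    """Mean negative log-likelihood under a K-component Laplacian mixture.

    x  : (N, 3) target 3D coordinates, one per point (hypothetical layout)
    mu : (K, N, 3) predicted locations, one set per stream
    b  : (K, N, 3) predicted scales (the per-point uncertainty)
    w  : (K,) mixture weights summing to 1 (assumed fixed here)
    """
    comp = laplace_pdf(x[None], mu, b)    # (K, N, 3) component densities
    mix = np.tensordot(w, comp, axes=1)   # (N, 3) weighted sum over streams
    return -np.log(mix + 1e-12).mean()

# Usage: two streams (K = 2) predicting 4 points each.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
mu = np.stack([x + 0.05, x - 0.10])       # stand-ins for the two stream outputs
b = np.full((2, 4, 3), 0.2)
w = np.array([0.5, 0.5])
loss = laplacian_mixture_nll(x, mu, b, w)
```

A larger predicted scale `b` flattens the density around a point, which is one way a per-point distribution can express shape uncertainty.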
Cite
Text
Zhang et al. "LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72698-9_27
Markdown
[Zhang et al. "LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/zhang2024eccv-lapose/) doi:10.1007/978-3-031-72698-9_27
BibTeX
@inproceedings{zhang2024eccv-lapose,
title = {{LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation}},
author = {Zhang, Ruida and Huang, Ziqin and Wang, Gu and Zhang, Chenyangguang and Di, Yan and Zuo, Xingxing and Tang, Jiwen and Ji, Xiangyang},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72698-9_27},
url = {https://mlanthology.org/eccv/2024/zhang2024eccv-lapose/}
}