LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation

Zhang, Ruida; Huang, Ziqin; Wang, Gu; Zhang, Chenyangguang; Di, Yan; Zuo, Xingxing; Tang, Jiwen; Ji, Xiangyang

doi:10.1007/978-3-031-72698-9_27

Ruida Zhang, Ziqin Huang, Gu Wang, Chenyangguang Zhang, Yan Di, Xingxing Zuo, Jiwen Tang, Xiangyang Ji

ECCV 2024

Abstract

While RGBD-based methods for category-level object pose estimation hold promise, their reliance on depth data limits their applicability in diverse scenarios. In response, recent efforts have turned to RGB-based methods; however, these face significant challenges stemming from the absence of depth information. On the one hand, the lack of depth makes intra-class shape variation harder to handle, increasing the uncertainty of shape predictions. On the other hand, RGB-only inputs introduce inherent scale ambiguity, rendering the estimation of object size and translation an ill-posed problem. To tackle these challenges, we propose LaPose, a novel framework that models the object shape as a Laplacian mixture model for Pose estimation. By representing each point as a probability distribution, we explicitly quantify the shape uncertainty. LaPose leverages both a generalized 3D information stream and a specialized feature stream to independently predict a Laplacian distribution for each point, capturing different aspects of object geometry. These two distributions are then integrated into a Laplacian mixture model to establish the 2D-3D correspondences, which are used to solve for the pose via a PnP module. To mitigate scale ambiguity, we introduce a scale-agnostic representation for object size and translation, enhancing training efficiency and overall robustness. Extensive experiments on the NOCS datasets validate the effectiveness of LaPose, which achieves state-of-the-art performance in RGB-based category-level object pose estimation. Code is released at https://github.com/lolrudy/LaPose.
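The abstract does not give the exact form of the mixture, so the following is only a minimal sketch of the general idea: two independently predicted Laplacian distributions for one coordinate are combined into a two-component mixture, and a target value is scored by its negative log-likelihood. The function names, the fixed mixing weight `weight_a`, and the per-axis independence assumption are all illustrative choices, not details taken from the paper or its released code.

```python
import math


def laplace_logpdf(x, mu, b):
    """Log-density of a 1D Laplace distribution with location mu and scale b > 0."""
    if b <= 0.0:
        raise ValueError("scale b must be positive")
    return -abs(x - mu) / b - math.log(2.0 * b)


def laplacian_mixture_nll(x, mu_a, b_a, mu_b, b_b, weight_a=0.5):
    """Negative log-likelihood of x under a two-component Laplacian mixture.

    Component "a" and component "b" stand in for the predictions of two
    separate streams (hypothetical naming). weight_a is an assumed mixing
    weight in the open interval (0, 1).
    """
    if not 0.0 < weight_a < 1.0:
        raise ValueError("weight_a must lie strictly between 0 and 1")
    log_a = math.log(weight_a) + laplace_logpdf(x, mu_a, b_a)
    log_b = math.log(1.0 - weight_a) + laplace_logpdf(x, mu_b, b_b)
    # Log-sum-exp keeps the computation stable when both densities are tiny.
    m = max(log_a, log_b)
    log_p = m + math.log(math.exp(log_a - m) + math.exp(log_b - m))
    return -log_p


def point_nll(target, pred_a, pred_b, weight_a=0.5):
    """Sum the per-axis mixture NLL for one 3D point.

    pred_a and pred_b are (mu_xyz, b_xyz) pairs; treating the three axes as
    independent is an assumption made for this sketch.
    """
    (mu_a, b_a), (mu_b, b_b) = pred_a, pred_b
    return sum(
        laplacian_mixture_nll(t, ma, sa, mb, sb, weight_a)
        for t, ma, sa, mb, sb in zip(target, mu_a, b_a, mu_b, b_b)
    )
```

Under this sketch, a larger predicted scale `b` flattens the density around a point, which is one way a per-point distribution can express higher shape uncertainty.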

Cite

Text

Zhang et al. "LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72698-9_27

Markdown

[Zhang et al. "LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/zhang2024eccv-lapose/) doi:10.1007/978-3-031-72698-9_27

BibTeX

@inproceedings{zhang2024eccv-lapose,
  title     = {{LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation}},
  author    = {Zhang, Ruida and Huang, Ziqin and Wang, Gu and Zhang, Chenyangguang and Di, Yan and Zuo, Xingxing and Tang, Jiwen and Ji, Xiangyang},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72698-9_27},
  url       = {https://mlanthology.org/eccv/2024/zhang2024eccv-lapose/}
}