GCE-Pose: Global Context Enhancement for Category-Level Object Pose Estimation

Abstract

A key challenge in model-free category-level pose estimation is the extraction of contextual object features that generalize across varying instances within a specific category. Recent approaches leverage foundational features to capture semantic and geometric cues from data. However, these approaches fail under partial visibility. We overcome this with a first-complete-then-aggregate strategy for feature extraction utilizing class priors. In this paper, we present GCE-Pose, a method that enhances pose estimation for novel instances by integrating a category-level global context prior. GCE-Pose first performs semantic shape reconstruction with a proposed Semantic Shape Reconstruction (SSR) module. Given an unseen partial RGB-D object instance, our SSR module reconstructs the instance's global geometry and semantics by deforming category-specific 3D semantic prototypes through a learned deep Linear Shape Model. We then introduce a Global Context Enhanced (GCE) feature fusion module that effectively fuses features from partial RGB-D observations and the reconstructed global context. Extensive experiments validate the impact of our global context prior and the effectiveness of the GCE fusion module, demonstrating that GCE-Pose significantly outperforms existing methods on the challenging real-world datasets HouseCat6D and NOCS-REAL275.
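To make the idea of deforming a category prototype with a linear shape model more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the generic linear form in which a deformed shape equals the prototype plus a weighted sum of per-point offset fields. The function name `deform_prototype`, the toy prototype, the basis, and the coefficients are all hypothetical; in the paper the model is learned and "deep", and the abstract does not specify how the coefficients are obtained.

```python
from typing import List, Sequence

Point = List[float]


def deform_prototype(
    prototype: Sequence[Sequence[float]],
    basis: Sequence[Sequence[Sequence[float]]],
    coeffs: Sequence[float],
) -> List[Point]:
    """Generic linear shape model (illustrative assumption, not the paper's code).

    prototype: N points, each (x, y, z)
    basis:     K offset fields, each holding N offsets (dx, dy, dz)
    coeffs:    K weights, one per offset field
    Returns prototype + sum_k coeffs[k] * basis[k], keeping the point order.
    """
    if len(basis) != len(coeffs):
        raise ValueError("need exactly one coefficient per basis field")

    # Copy the prototype so the input is left untouched.
    deformed = [[float(v) for v in p] for p in prototype]

    for weight, field in zip(coeffs, basis):
        if len(field) != len(deformed):
            raise ValueError("each basis field needs one offset per point")
        for point, offset in zip(deformed, field):
            for axis in range(3):
                point[axis] += weight * offset[axis]
    return deformed


# Hypothetical toy data: a two-point prototype and two offset fields.
prototype = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
basis = [
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # field 0: lift every point along z
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # field 1: stretch point 1 along x
]
coeffs = [0.5, 2.0]

instance_shape = deform_prototype(prototype, basis, coeffs)
```

Because the deformation in this sketch preserves point order, any per-point semantic features stored alongside the prototype could be reused for the deformed shape by index. Whether GCE-Pose transfers semantics in exactly this way is an assumption here, not something the abstract states.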

Cite

Text

Li et al. "GCE-Pose: Global Context Enhancement for Category-Level Object Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02529

Markdown

[Li et al. "GCE-Pose: Global Context Enhancement for Category-Level Object Pose Estimation." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/li2025cvpr-gcepose/) doi:10.1109/CVPR52734.2025.02529

BibTeX

@inproceedings{li2025cvpr-gcepose,
  title     = {{GCE-Pose: Global Context Enhancement for Category-Level Object Pose Estimation}},
  author    = {Li, Weihang and Xu, Hongli and Huang, Junwen and Jung, Hyunjun and Yu, Peter KT and Navab, Nassir and Busam, Benjamin},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {27154-27165},
  doi       = {10.1109/CVPR52734.2025.02529},
  url       = {https://mlanthology.org/cvpr/2025/li2025cvpr-gcepose/}
}