Pre-Defined Keypoints Promote Category-Level Articulation Pose Estimation via Multi-Modal Alignment

Abstract

Articulated objects are essential in everyday interactions, yet traditional RGB-based pose estimation methods often struggle with lighting variations and shadows. To overcome these challenges, we propose a novel Pre-defined keypoint based framework for category-level articulation pose estimation via multi-modal Alignment, coined PAGE. Specifically, we first propose a customized keypoint estimation method that avoids the divergent distance pattern between heuristically generated keypoints and visible points. In addition, to reduce the mutual-information redundancy between point clouds and RGB images, we design a geometry-color alignment module that fuses the features after aligning the two modalities. The framework then decodes a radius for each visible point and applies our proposal integration scoring strategy to predict keypoints. Finally, it outputs the per-part 6D pose of the articulated object. We conduct extensive experiments to evaluate PAGE across a variety of datasets, from synthetic to real-world scenarios, demonstrating its robustness and superior performance.
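To make the radius-decoding step concrete, the sketch below shows one generic way a keypoint could be recovered once a radius (the distance from a visible point to the keypoint) is available for every visible point. It uses plain linearized least-squares multilateration, which is a stand-in and not the paper's proposal integration scoring strategy. The function name `keypoint_from_radii`, the synthetic data, and the assumption that radii are noise-free are all illustrative and do not come from the source.

```python
import numpy as np


def keypoint_from_radii(points, radii):
    """Estimate a 3D keypoint from per-point distances (hypothetical helper).

    Each visible point p_i with radius r_i constrains the keypoint k to the
    sphere ||k - p_i|| = r_i. Subtracting the first sphere equation from the
    others cancels the quadratic term ||k||^2 and leaves a linear system
    2 (p_i - p_0) . k = (||p_i||^2 - ||p_0||^2) - (r_i^2 - r_0^2),
    which is solved here in the least-squares sense.
    """
    points = np.asarray(points, dtype=float)
    radii = np.asarray(radii, dtype=float)
    if points.shape[0] < 4:
        raise ValueError("need at least 4 non-coplanar points for a unique 3D solution")

    p0, r0 = points[0], radii[0]
    A = 2.0 * (points[1:] - p0)
    b = (np.sum(points[1:] ** 2, axis=1) - np.sum(p0 ** 2)) - (radii[1:] ** 2 - r0 ** 2)

    keypoint, *_ = np.linalg.lstsq(A, b, rcond=None)
    return keypoint


# Synthetic check: random "visible points" and exact radii to a known keypoint.
rng = np.random.default_rng(0)
visible = rng.uniform(-1.0, 1.0, size=(200, 3))
true_kp = np.array([0.3, -0.2, 0.5])
radii = np.linalg.norm(visible - true_kp, axis=1)

estimate = keypoint_from_radii(visible, radii)
print("estimated keypoint:", estimate)
```

With predicted (noisy) radii, a plain least-squares solve like this is sensitive to outliers, which is presumably one reason a scoring-based integration of candidate proposals is attractive; the details of that strategy are in the paper itself.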

Cite

Text

Xu et al. "Pre-Defined Keypoints Promote Category-Level Articulation Pose Estimation via Multi-Modal Alignment." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/237

Markdown

[Xu et al. "Pre-Defined Keypoints Promote Category-Level Articulation Pose Estimation via Multi-Modal Alignment." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/xu2025ijcai-pre/) doi:10.24963/IJCAI.2025/237

BibTeX

@inproceedings{xu2025ijcai-pre,
  title     = {{Pre-Defined Keypoints Promote Category-Level Articulation Pose Estimation via Multi-Modal Alignment}},
  author    = {Xu, Wenbo and Zhang, Li and Liu, Liu and Zhong, Yan and Jiang, Haonan and Wang, Xue and Wang, Rujing},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {2125--2133},
  doi       = {10.24963/IJCAI.2025/237},
  url       = {https://mlanthology.org/ijcai/2025/xu2025ijcai-pre/}
}