Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching
Abstract
In this paper, we present a novel generalizable object pose estimation method that determines the object pose using only one RGB image. Unlike traditional approaches that rely on instance-level object pose estimation and necessitate extensive training data, our method generalizes to unseen objects without extensive training, operates with a single reference image of the object, and eliminates the need for 3D object models or multiple views of the object. These characteristics are achieved by utilizing a diffusion model to generate novel-view images and conducting two-sided matching on these generated images. Quantitative experiments demonstrate the superiority of our method over existing pose estimation techniques on both synthetic and real-world datasets. Remarkably, our approach maintains strong performance even under significant viewpoint changes, highlighting its robustness and versatility in challenging conditions. The code will be released at https://github.com/scy639/Gen2SM
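The sketch below is a minimal illustration (not the authors' released code) of the two-side generate-and-match idea summarized in the abstract: candidate novel views are synthesized from both the reference and the query image, and the relative pose is taken from the candidate viewpoint whose synthesized views match the opposite side best. The functions `synthesize_view` and `match_score` are hypothetical placeholders for a diffusion-based novel-view generator and a 2D image matcher, and the pose parameterization is assumed for illustration only.

```python
# Hedged sketch of two-sided generating and matching for relative pose.
# `synthesize_view` and `match_score` are assumed interfaces, not part of
# the paper's released implementation.

from typing import Callable, Sequence, Tuple
import numpy as np

Image = np.ndarray          # H x W x 3 RGB image
Pose = Tuple[float, float]  # assumed (elevation, azimuth) of a candidate viewpoint, degrees


def two_side_pose_estimate(
    query: Image,
    reference: Image,
    candidate_poses: Sequence[Pose],
    synthesize_view: Callable[[Image, Pose], Image],  # hypothetical diffusion novel-view model
    match_score: Callable[[Image, Image], float],     # hypothetical matcher (higher = better)
) -> Pose:
    """Return the candidate relative pose with the best two-sided match score."""
    best_pose, best_score = candidate_poses[0], -np.inf
    for pose in candidate_poses:
        # Side 1: rotate the reference toward the candidate pose, compare to the query.
        ref_to_query = match_score(synthesize_view(reference, pose), query)
        # Side 2: rotate the query by the inverse pose, compare to the reference.
        inverse = (-pose[0], -pose[1])
        query_to_ref = match_score(synthesize_view(query, inverse), reference)
        # Combine both directions; a simple symmetric choice is to average the scores.
        score = 0.5 * (ref_to_query + query_to_ref)
        if score > best_score:
            best_pose, best_score = pose, score
    return best_pose
```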
Cite
Text
Sun et al. "Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching." Winter Conference on Applications of Computer Vision, 2025.
Markdown
[Sun et al. "Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/sun2025wacv-generalizable/)
BibTeX
@inproceedings{sun2025wacv-generalizable,
title = {{Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching}},
author = {Sun, Yujing and Sun, Caiyi and Liu, Yuan and Ma, Yuexin and Yiu, Siu Ming},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2025},
pages = {545--556},
url = {https://mlanthology.org/wacv/2025/sun2025wacv-generalizable/}
}