Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking
Abstract
6D object pose estimation is a core problem in computer vision. However, the field lacks large-scale and diverse datasets, which impedes comprehensive model evaluation and limits downstream applications. To address this, this paper introduces Omni6DPose, a large benchmark characterized by its diversity of object categories, its scale, and its variety of object materials. Omni6DPose is divided into three main components: ROPE (Real 6D Object Pose Estimation Dataset), which comprises annotated images of object instances across multiple categories; SOPE (Simulated 6D Object Pose Estimation Dataset), a simulated training set created with mixed reality and physics-based depth simulation; and PAM (Pose Aligned 3D Models), the manually aligned real scanned objects used in ROPE and SOPE. Omni6DPose is inherently challenging because of its substantial variations and ambiguities. To handle them, we introduce an enhanced version of the state-of-the-art category-level 6D object pose estimation framework that incorporates two key improvements: semantic-aware feature extraction and clustering-based aggregation. We also provide a comprehensive benchmark analysis of previous methods on this new large-scale dataset for both 6D object pose estimation and pose tracking.
Cite
Text
Zhang et al. "Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73226-3_12
Markdown
[Zhang et al. "Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/zhang2024eccv-omni6dpose/) doi:10.1007/978-3-031-73226-3_12
BibTeX
@inproceedings{zhang2024eccv-omni6dpose,
title = {{Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking}},
author = {Zhang, Jiyao and Huang, Weiyao and Peng, Bo and Wu, Mingdong and Hu, Fei and Chen, Zijian and Zhao, Bo and Dong, Hao},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-73226-3_12},
url = {https://mlanthology.org/eccv/2024/zhang2024eccv-omni6dpose/}
}