LanPose: Language-Instructed 6d Object Pose Estimation for Robotic Assembly
Abstract
Comprehending natural language instructions is a critical skill for robots to cooperate effectively with humans. In this paper, we aim to establish an integrated robotic system that robustly perceives, grasps, manipulates, and assembles blocks according to language commands. For this purpose, we propose the Language-Instructed 6D Pose Regression Network (LanPose), which jointly predicts the 6D pose of the observed object and the corresponding assembly pose. Our approach is based on the fusion of geometric and linguistic features: a cross-attention mechanism integrates the multi-modality input, and a language-integrated 6D pose mapping module maps it to the 6D pose in SE(3) space. To effectively train and validate LanPose, we present a synthetic dataset, Block6D, annotated with language instructions for robotic assembly and the corresponding 6D assembly poses. LanPose achieves 98.09 and 93.55 on ADD(-S)-0.1d for 6D object pose and 6D assembly pose prediction, respectively. An 82.1% success rate in real-robot assembly demonstrates the effectiveness of our methodology.
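The abstract reports results in ADD(-S)-0.1d without defining the metric. The sketch below shows how that metric is commonly computed in the 6D pose literature, assuming the paper follows the usual definition: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, ADD-S (used for symmetric objects) averages the distance to the nearest predicted point instead, and a pose counts as correct when the error is below 10% of the object diameter d. All function names here are hypothetical and are not taken from the paper's code.

```python
import math

def transform(R, t, p):
    """Apply rotation R (3x3 nested list) and translation t (length 3) to point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def add_error(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding model points under both poses."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_pred, t_pred, p))
    return total / len(points)

def adds_error(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its nearest predicted point."""
    pred = [transform(R_pred, t_pred, p) for p in points]
    total = 0.0
    for p in points:
        g = transform(R_gt, t_gt, p)
        total += min(math.dist(g, q) for q in pred)
    return total / len(points)

def diameter(points):
    """Object diameter d: largest distance between any two model points."""
    return max(math.dist(a, b) for a in points for b in points)

def is_correct(error, diam, factor=0.1):
    """A pose is accepted when its error is below factor * d (0.1d by default)."""
    return error < factor * diam
```

Under this definition, a reported score such as 98.09 would be the percentage of test poses for which `is_correct` holds, using `adds_error` for symmetric objects and `add_error` otherwise.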
Cite
Text
Fu et al. "LanPose: Language-Instructed 6d Object Pose Estimation for Robotic Assembly." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91569-7_4
Markdown
[Fu et al. "LanPose: Language-Instructed 6d Object Pose Estimation for Robotic Assembly." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/fu2024eccvw-lanpose/) doi:10.1007/978-3-031-91569-7_4
BibTeX
@inproceedings{fu2024eccvw-lanpose,
title = {{LanPose: Language-Instructed 6d Object Pose Estimation for Robotic Assembly}},
author = {Fu, Bowen and Leong, Sek Kun and Di, Yan and Wang, Gu and Tang, Jiwen and Tombari, Federico and Ji, Xiangyang},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {43-59},
doi = {10.1007/978-3-031-91569-7_4},
url = {https://mlanthology.org/eccvw/2024/fu2024eccvw-lanpose/}
}