Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach
Abstract
Instance segmentation of unseen objects is a challenging problem in unstructured environments. To address it, we propose a robot learning approach that actively interacts with novel objects and collects a training label for each object, which is then used to fine-tune the segmentation model and improve its performance while avoiding the time-consuming process of manually labeling a dataset. Given a cluttered pile of objects, our approach chooses pushing and grasping motions to break up the clutter and performs object-agnostic grasping; the Singulation-and-Grasping (SaG) policy takes visual observations and imperfect segmentation as input. We decompose the problem into three subtasks: (1) the object singulation subtask separates the objects from each other, creating more space and thereby easing (2) the collision-free grasping subtask; (3) the mask generation subtask obtains self-labeled ground truth masks for transfer learning by using an optical flow-based binary classifier and motion cue post-processing. Our system achieves a 70% singulation success rate in simulated cluttered scenes. The interactive segmentation of our system achieves 87.8%, 73.9%, and 69.3% average precision for toy blocks, YCB objects in simulation, and real-world novel objects, respectively, outperforming the compared baselines.
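As a rough illustration of how a motion cue can yield a self-labeled mask, the sketch below thresholds the magnitude of a dense optical flow field and compares the result with an imperfect predicted mask using intersection over union (IoU). This is a simplified stand-in and not the optical flow-based binary classifier or post-processing described in the abstract; the function names `motion_mask` and `mask_iou`, the threshold value, and the toy flow field are all assumptions made for this example.

```python
import math


def motion_mask(flow, threshold=1.0):
    """Mark pixels whose flow magnitude exceeds `threshold` (hypothetical helper).

    `flow` is a 2D grid (list of rows) of (dx, dy) displacement tuples.
    """
    return [
        [1 if math.hypot(dx, dy) > threshold else 0 for dx, dy in row]
        for row in flow
    ]


def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks of equal shape."""
    inter = sum(a & b for ra, rb in zip(mask_a, mask_b) for a, b in zip(ra, rb))
    union = sum(a | b for ra, rb in zip(mask_a, mask_b) for a, b in zip(ra, rb))
    return inter / union if union else 0.0


# Toy 3x4 flow field: three pixels move noticeably after an interaction.
flow = [
    [(0.0, 0.0), (0.1, 0.0), (0.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (3.0, 4.0), (2.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (0.0, 2.5), (0.2, 0.1), (0.0, 0.0)],
]

# Assumed threshold of 1 pixel; a real system would tune this value.
mask = motion_mask(flow, threshold=1.0)

# An imperfect predicted mask that misses one moving pixel.
predicted = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

iou = mask_iou(mask, predicted)
```

In this toy case the motion-derived mask covers three pixels and the prediction covers two of them, so the IoU is 2/3. A mask obtained this way could serve as a training label only after further filtering, which the sketch omits.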
Cite
Text
Yu and Choi. "Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19842-7_36
Markdown
[Yu and Choi. "Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/yu2022eccv-selfsupervised/) doi:10.1007/978-3-031-19842-7_36
BibTeX
@inproceedings{yu2022eccv-selfsupervised,
title = {{Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach}},
author = {Yu, Houjian and Choi, Changhyun},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19842-7_36},
url = {https://mlanthology.org/eccv/2022/yu2022eccv-selfsupervised/}
}