Unsupervised Segmentation in Real-World Images via Spelke Object Inference
Abstract
Self-supervised, category-agnostic segmentation of real-world images is a challenging open problem in computer vision. Here, we show how to learn static grouping priors from motion self-supervision by building on the cognitive science concept of a Spelke Object: a set of physical stuff that moves together. We introduce the Excitatory-Inhibitory Segment Extraction Network (EISEN), which learns to extract pairwise affinity graphs for static scenes from motion-based training signals. EISEN then produces segments from affinities using a novel graph propagation and competition network. During training, objects that undergo correlated motion (such as robot arms and the objects they move) are decoupled by a bootstrapping process: EISEN explains away the motion of objects it has already learned to segment. We show that EISEN achieves a substantial improvement in the state of the art for self-supervised image segmentation on challenging synthetic and real-world robotics datasets.
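To make the idea of turning pairwise affinities into segments concrete, here is a minimal toy sketch. It is not EISEN's graph propagation and competition network, which the abstract does not specify in detail. It assumes a small hand-written affinity matrix and swaps in plain minimum-label propagation over thresholded affinities. The function name `propagate_labels`, the `threshold` parameter, and the example values are all hypothetical.

```python
def propagate_labels(affinity, threshold=0.5, max_iters=100):
    """Toy grouping: spread the smallest label along strong affinities.

    Illustration only (an assumption, not the paper's method): nodes joined
    by an affinity above `threshold` end up sharing one segment label.
    """
    n = len(affinity)
    labels = list(range(n))  # every node starts as its own segment
    for _ in range(max_iters):
        new_labels = [
            min([labels[i]] + [labels[j] for j in range(n)
                               if affinity[i][j] > threshold])
            for i in range(n)
        ]
        if new_labels == labels:  # converged: no label changed
            break
        labels = new_labels
    return labels


# Hypothetical 4-node affinity matrix: nodes 0-1 and nodes 2-3 are strongly
# linked, standing in for two groups of pixels that each move together.
affinity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
segments = propagate_labels(affinity)
```

In this toy setting, nodes that share a label form one segment; a learned model would predict the affinities from a static image instead of taking them as given.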
Cite
Text
Chen et al. "Unsupervised Segmentation in Real-World Images via Spelke Object Inference." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19818-2_41
Markdown
[Chen et al. "Unsupervised Segmentation in Real-World Images via Spelke Object Inference." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/chen2022eccv-unsupervised/) doi:10.1007/978-3-031-19818-2_41
BibTeX
@inproceedings{chen2022eccv-unsupervised,
title = {{Unsupervised Segmentation in Real-World Images via Spelke Object Inference}},
author = {Chen, Honglin and Venkatesh, Rahul and Friedman, Yoni and Wu, Jiajun and Tenenbaum, Joshua B. and Yamins, Daniel L. K. and Bear, Daniel M.},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19818-2_41},
url = {https://mlanthology.org/eccv/2022/chen2022eccv-unsupervised/}
}