Object-Centric Learning with Cyclic Walks Between Parts and Whole

Abstract

Learning object-centric representations from complex natural environments enables both humans and machines to reason from low-level perceptual features. To capture compositional entities of the scene, we propose cyclic walks between perceptual features extracted from vision transformers and object entities. First, a slot-attention module interfaces with these perceptual features and produces a finite set of slot representations. These slots can bind to any object entities in the scene via inter-slot competition for attention. Next, we establish entity-feature correspondence with cyclic walks along paths of high transition probability, based on the pairwise similarity between perceptual features (aka "parts") and slot-bound object representations (aka "whole"). The whole is greater than its parts and the parts constitute the whole. These part-whole interactions form cycle consistencies that serve as supervisory signals for training the slot-attention module. Our rigorous experiments on seven image datasets in three unsupervised tasks demonstrate that networks trained with our cyclic walks can disentangle foregrounds and backgrounds, discover objects, and segment semantic objects in complex scenes. In contrast to object-centric models that rely on a decoder for pixel-level or feature-level reconstruction, our cyclic walks provide strong learning signals while avoiding computational overhead and improving memory efficiency. Our source code and data are available at: https://github.com/ZhangLab-DeepNeuroCogLab/Parts-Whole-Object-Centric-Learning/
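
The cycle-consistency idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cyclic_walk_losses`, the cosine similarity, the `temperature` value, and the use of the identity matrix as the round-trip target for both walk directions are all assumptions made for illustration. Transition probabilities come from a softmax over pairwise similarities between features ("parts") and slots ("whole"), and a walk that leaves a node and returns to it is rewarded.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(x, eps=1e-8):
    # Unit-normalize rows so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cyclic_walk_losses(features, slots, temperature=0.1):
    """Hypothetical sketch of part-whole cyclic walks.

    features: (N, D) perceptual features, the "parts".
    slots:    (K, D) slot representations, the "whole".
    """
    sim = l2_normalize(features) @ l2_normalize(slots).T  # (N, K)

    # Transition probabilities in each direction (rows sum to 1).
    part_to_whole = softmax(sim / temperature, axis=1)    # (N, K)
    whole_to_part = softmax(sim.T / temperature, axis=1)  # (K, N)

    # Round trips: slot -> features -> slot and feature -> slots -> feature.
    whole_round_trip = whole_to_part @ part_to_whole      # (K, K)
    part_round_trip = part_to_whole @ whole_to_part       # (N, N)

    # Assumed objective: each walk should return to its starting node,
    # i.e. cross-entropy against the identity matrix.
    eps = 1e-12
    loss_whole = -np.mean(np.log(np.diag(whole_round_trip) + eps))
    loss_part = -np.mean(np.log(np.diag(part_round_trip) + eps))
    return whole_round_trip, part_round_trip, loss_whole, loss_part

# Usage with random stand-ins for ViT features and slots.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))   # 16 hypothetical patch features
slots = rng.normal(size=(4, 8))    # 4 hypothetical slots
_, _, loss_whole, loss_part = cyclic_walk_losses(feats, slots)
print(float(loss_whole), float(loss_part))
```

Because both losses depend only on similarity matrices between features and slots, a sketch like this needs no decoder, which is consistent with the efficiency argument made in the abstract.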

Cite

Text

Wang et al. "Object-Centric Learning with Cyclic Walks Between Parts and Whole." Neural Information Processing Systems, 2023.

Markdown

[Wang et al. "Object-Centric Learning with Cyclic Walks Between Parts and Whole." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/wang2023neurips-objectcentric/)

BibTeX

@inproceedings{wang2023neurips-objectcentric,
  title     = {{Object-Centric Learning with Cyclic Walks Between Parts and Whole}},
  author    = {Wang, Ziyu and Shou, Mike Zheng and Zhang, Mengmi},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/wang2023neurips-objectcentric/}
}