Grouped Discrete Representation for Object-Centric Learning

Abstract

Object-Centric Learning (OCL) aims to discover objects in images or videos by reconstructing the input. Representative methods do so by reconstructing the input in the form of its Variational Autoencoder (VAE) discrete representation, which suppresses (super-)pixel noise and enhances object separability. However, these methods treat features as indivisible units, overlooking their compositional attributes, and discretize features via scalar code indexes, losing attribute-level similarities and differences. We propose Grouped Discrete Representation (GDR) for OCL. For better generalization, features are decomposed into combinatorial attributes by organized channel grouping. For better convergence, features are quantized into discrete representations via tuple code indexes. Experiments demonstrate that GDR consistently improves both mainstream and state-of-the-art OCL methods across various datasets. Visualizations further highlight GDR’s superior object separability and interpretability. The source code is available at https://github.com/Genera1Z/GroupedDiscreteRepresentation.
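To make the contrast between scalar and tuple code indexes concrete, the following is a minimal sketch of grouped quantization, not the authors' implementation (see the linked repository for that). It assumes a plain contiguous split of channels into groups, one nearest-neighbor codebook per group, and hypothetical names `grouped_quantize` and `tuple_to_scalar`; the paper's "organized" channel grouping may differ from this simple split.

```python
import numpy as np


def grouped_quantize(features, codebooks):
    """Quantize each channel group with its own codebook (illustrative only).

    features:  (N, C) array of continuous feature vectors.
    codebooks: (G, K, C // G) array, i.e. G groups with K codes each.
    Returns tuple indexes of shape (N, G) and quantized features (N, C).
    """
    n, c = features.shape
    g, k, d = codebooks.shape
    assert c == g * d, "channels must split evenly into groups"
    # Assumption: groups are contiguous channel slices.
    groups = features.reshape(n, g, d)
    # Squared Euclidean distance to every code in the matching group: (N, G, K).
    diff = groups[:, :, None, :] - codebooks[None, :, :, :]
    dist = (diff ** 2).sum(axis=-1)
    indices = dist.argmin(axis=-1)  # one index per group, i.e. a tuple index
    quantized = codebooks[np.arange(g)[None, :], indices]  # (N, G, d)
    return indices, quantized.reshape(n, c)


def tuple_to_scalar(indices, k):
    """Flatten tuple indexes into the equivalent scalar index (mixed radix)."""
    g = indices.shape[1]
    weights = k ** np.arange(g - 1, -1, -1)
    return (indices * weights).sum(axis=1)


# Two groups with four codes each cover 4**2 = 16 combinations
# while storing only 2 * 4 code vectors.
codebooks = np.arange(24, dtype=float).reshape(2, 4, 3)
a = np.concatenate([codebooks[0, 2], codebooks[1, 0]])
b = np.concatenate([codebooks[0, 2], codebooks[1, 3]])
indices, _ = grouped_quantize(np.stack([a, b]), codebooks)
print(indices.tolist())                      # [[2, 0], [2, 3]]
print(tuple_to_scalar(indices, 4).tolist())  # [8, 11]
```

In this toy case the two features share the first tuple entry (2), so their attribute-level similarity is visible in the tuple indexes, whereas the flattened scalar indexes 8 and 11 carry no such relationship.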

Cite

Text

Zhao et al. "Grouped Discrete Representation for Object-Centric Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06106-5_27

Markdown

[Zhao et al. "Grouped Discrete Representation for Object-Centric Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/zhao2025ecmlpkdd-grouped/) doi:10.1007/978-3-032-06106-5_27

BibTeX

@inproceedings{zhao2025ecmlpkdd-grouped,
  title     = {{Grouped Discrete Representation for Object-Centric Learning}},
  author    = {Zhao, Rongzhen and Wang, Vivienne and Kannala, Juho and Pajarinen, Joni},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2025},
  pages     = {465--480},
  doi       = {10.1007/978-3-032-06106-5_27},
  url       = {https://mlanthology.org/ecmlpkdd/2025/zhao2025ecmlpkdd-grouped/}
}