Identifiable Object Representations Under Spatial Ambiguities

Abstract

Modular object-centric representations are essential for human-like reasoning but are challenging to obtain under spatial ambiguities, e.g., due to occlusions and view ambiguities. Addressing these challenges presents both theoretical and practical difficulties. We introduce a novel multi-view probabilistic approach that aggregates view-specific slots to capture invariant content information while simultaneously learning disentangled global viewpoint-level information. Unlike prior single-view methods, our approach resolves spatial ambiguities, provides theoretical guarantees for identifiability, and requires no viewpoint annotations. Extensive experiments on standard benchmarks and novel complex datasets validate our method’s robustness and scalability.

Cite

Text

Kori et al. "Identifiable Object Representations Under Spatial Ambiguities." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Kori et al. "Identifiable Object Representations Under Spatial Ambiguities." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/kori2025icml-identifiable/)

BibTeX

@inproceedings{kori2025icml-identifiable,
  title     = {{Identifiable Object Representations Under Spatial Ambiguities}},
  author    = {Kori, Avinash and Toni, Francesca and Glocker, Ben},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {31486--31518},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/kori2025icml-identifiable/}
}