On the Transfer of Object-Centric Representation Learning
Abstract
The goal of object-centric representation learning is to decompose visual scenes into a structured representation that captures each entity in a separate vector. Recent successes have shown that object-centric representation learning can be scaled to real-world scenes by utilizing features from pre-trained foundation models such as DINO. However, so far, these object-centric methods have mostly been applied in-distribution, with models trained and evaluated on the same dataset. This is in contrast to the underlying foundation models, which have been shown to be applicable to a wide range of data and tasks. Thus, in this work, we answer the question of whether current real-world-capable object-centric methods exhibit similar levels of transferability by introducing a benchmark comprising seven different synthetic and real-world datasets. We analyze the factors influencing performance under transfer and find that training on diverse real-world images improves generalization to unseen scenarios. Furthermore, inspired by the success of task-specific fine-tuning in foundation models, we introduce a novel fine-tuning strategy to adapt pre-trained vision encoders for the task of object discovery. We find that the proposed approach results in state-of-the-art performance for unsupervised object discovery, exhibiting strong zero-shot transfer to unseen datasets.
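The abstract describes object-centric models that group features from a pre-trained encoder (e.g., DINO) into per-object vectors. Below is a minimal sketch, not the authors' implementation, of that common recipe: a Slot Attention module grouping tokens from a frozen encoder into slot vectors. The `frozen_encoder` here is a hypothetical stand-in for a pre-trained DINO ViT, and all names and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: grouping frozen encoder tokens into per-object slots with Slot Attention.
# NOT the paper's method; a generic illustration of the setup the abstract alludes to.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlotAttention(nn.Module):
    """Iteratively groups a set of feature tokens into `num_slots` vectors."""

    def __init__(self, num_slots: int, dim: int, iters: int = 3):
        super().__init__()
        self.num_slots, self.iters = num_slots, iters
        self.scale = dim ** -0.5
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.gru = nn.GRUCell(dim, dim)
        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)

    def forward(self, inputs):  # inputs: (B, num_tokens, dim)
        b = inputs.shape[0]
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        # Initialize slots from a learned Gaussian.
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(
            b, self.num_slots, inputs.shape[-1], device=inputs.device)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # Attention is normalized over slots, so slots compete for tokens.
            attn = F.softmax(torch.einsum("bsd,bnd->bsn", q, k) * self.scale, dim=1)
            attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean over tokens
            updates = torch.einsum("bsn,bnd->bsd", attn, v)
            slots = self.gru(updates.reshape(-1, updates.shape[-1]),
                             slots.reshape(-1, slots.shape[-1])).reshape(slots.shape)
        return slots  # (B, num_slots, dim): one vector per discovered object


# Hypothetical stand-in for a frozen pre-trained DINO ViT producing patch tokens;
# in practice one would load an actual pre-trained encoder instead.
frozen_encoder = nn.Conv2d(3, 384, kernel_size=16, stride=16)
for p in frozen_encoder.parameters():
    p.requires_grad = False

image = torch.randn(2, 3, 224, 224)                          # dummy batch
tokens = frozen_encoder(image).flatten(2).transpose(1, 2)    # (B, 196, 384)
slots = SlotAttention(num_slots=7, dim=384)(tokens)
print(slots.shape)                                            # torch.Size([2, 7, 384])
```

In such pipelines the slots are typically trained by decoding them back to the encoder features (as in DINOSAUR-style models); whether the encoder stays frozen or is fine-tuned, as the abstract's proposed strategy suggests, is the design choice under study.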
Cite
Text
Didolkar et al. "On the Transfer of Object-Centric Representation Learning." International Conference on Learning Representations, 2025.
Markdown
[Didolkar et al. "On the Transfer of Object-Centric Representation Learning." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/didolkar2025iclr-transfer/)
BibTeX
@inproceedings{didolkar2025iclr-transfer,
title = {{On the Transfer of Object-Centric Representation Learning}},
author = {Didolkar, Aniket Rajiv and Zadaianchuk, Andrii and Goyal, Anirudh and Mozer, Michael Curtis and Bengio, Yoshua and Martius, Georg and Seitzer, Maximilian},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/didolkar2025iclr-transfer/}
}