Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning
Abstract
While deep reinforcement learning (RL) from pixels has achieved remarkable success, its sample inefficiency remains a critical limitation for real-world applications. Model-based RL (MBRL) addresses this by learning a world model to generate simulated experience, but standard approaches that rely on pixel-level reconstruction losses often fail to capture small, task-critical objects in complex, dynamic scenes. We posit that an object-centric (OC) representation can direct model capacity toward semantically meaningful entities, improving dynamics prediction and sample efficiency. In this work, we introduce **OC-STORM**, an object-centric MBRL framework that enhances a learned world model with object representations extracted by a pretrained segmentation network. By conditioning on a minimal number of annotated frames, OC-STORM learns to track decision-relevant object dynamics and inter-object interactions without extensive labeling or access to privileged information. Empirical results demonstrate that OC-STORM significantly outperforms the STORM baseline on the Atari 100k benchmark and achieves state-of-the-art sample efficiency on challenging boss fights in the visually complex game **Hollow Knight**. Our findings underscore the potential of integrating OC priors into MBRL for complex visual domains. Project page: https://oc-storm.weipuzhang.com
Cite
Text
Zhang et al. "Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning." International Conference on Learning Representations, 2026.
Markdown
[Zhang et al. "Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhang2026iclr-objectcentric/)
BibTeX
@inproceedings{zhang2026iclr-objectcentric,
  title = {{Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning}},
  author = {Zhang, Weipu and Jelley, Adam and McInroe, Trevor and Storkey, Amos and Wang, Gang},
  booktitle = {International Conference on Learning Representations},
  year = {2026},
  url = {https://mlanthology.org/iclr/2026/zhang2026iclr-objectcentric/}
}