Entity Abstraction in Visual Model-Based Reinforcement Learning

Abstract

We present OP3, a framework for model-based reinforcement learning that acquires object representations from raw visual observations without supervision and uses them to predict and plan. To ground these abstract entity representations in actual objects in the world, we formulate an interactive inference algorithm that incorporates dynamic information from the scene. Our model can handle a variable number of entities by symmetrically processing each object representation with the same locally-scoped function. On block-stacking tasks, OP3 can generalize to novel block configurations and more objects than seen during training, outperforming both a model that assumes access to object supervision and a state-of-the-art video prediction model.
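To make the "symmetric processing with the same locally-scoped function" idea concrete, the sketch below shows one way such a per-entity update can be wired up so that the same weights apply regardless of how many entities are present. This is a minimal illustrative sketch, not the authors' OP3 implementation; the function names (`pairwise_effect`, `update_entity`, `predict_next_latents`) and the simple tanh parameterization are assumptions made for clarity.

```python
# Minimal sketch (NOT the OP3 implementation) of symmetric, locally-scoped
# entity processing: a shared pairwise-interaction function and a shared
# per-entity update are applied to every entity latent, so the number of
# entities K can vary between training and test time.
import numpy as np

def pairwise_effect(z_i, z_j, w_pair):
    # Effect of entity j on entity i, computed by a function shared across all pairs.
    return np.tanh(w_pair @ np.concatenate([z_i, z_j]))

def update_entity(z_i, aggregated_effect, w_update):
    # Shared per-entity update: next latent from the entity's own state plus summed effects.
    return np.tanh(w_update @ np.concatenate([z_i, aggregated_effect]))

def predict_next_latents(latents, w_pair, w_update):
    """Apply the same locally-scoped functions to each entity latent.

    latents: array of shape (K, D) for K entity latents of dimension D.
    """
    K, D = latents.shape
    next_latents = np.empty_like(latents)
    for i in range(K):
        effects = [pairwise_effect(latents[i], latents[j], w_pair)
                   for j in range(K) if j != i]
        aggregated = np.sum(effects, axis=0) if effects else np.zeros(D)
        next_latents[i] = update_entity(latents[i], aggregated, w_update)
    return next_latents

# Usage: the same weights handle any number of entities.
rng = np.random.default_rng(0)
D = 8
w_pair = rng.normal(size=(D, 2 * D))
w_update = rng.normal(size=(D, 2 * D))
for K in (3, 5):  # e.g. more entities at test time than during training
    z = rng.normal(size=(K, D))
    print(K, predict_next_latents(z, w_pair, w_update).shape)
```

Because every entity is processed by the same function, adding entities only adds more applications of that function rather than requiring new parameters, which is what allows generalization to scenes with more objects than seen during training.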

Cite

Text

Veerapaneni et al. "Entity Abstraction in Visual Model-Based Reinforcement Learning." Conference on Robot Learning, 2019.

Markdown

[Veerapaneni et al. "Entity Abstraction in Visual Model-Based Reinforcement Learning." Conference on Robot Learning, 2019.](https://mlanthology.org/corl/2019/veerapaneni2019corl-entity/)

BibTeX

@inproceedings{veerapaneni2019corl-entity,
  title     = {{Entity Abstraction in Visual Model-Based Reinforcement Learning}},
  author    = {Veerapaneni, Rishi and Co-Reyes, John D. and Chang, Michael and Janner, Michael and Finn, Chelsea and Wu, Jiajun and Tenenbaum, Joshua and Levine, Sergey},
  booktitle = {Conference on Robot Learning},
  year      = {2019},
  pages     = {1439--1456},
  volume    = {100},
  url       = {https://mlanthology.org/corl/2019/veerapaneni2019corl-entity/}
}