GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction

Abstract

We present GATSBI, a generative model that can transform a sequence of raw observations into a structured latent representation that fully captures the spatio-temporal context of the agent's actions. In vision-based decision-making scenarios, an agent faces complex high-dimensional observations where multiple entities interact with each other. The agent requires a good scene representation of the visual observation that discerns essential components that consistently propagate along the time horizon. Our method, GATSBI, utilizes unsupervised scene representation learning to successfully separate an active agent, static background, and passive objects. GATSBI then models the interactions reflecting the causal relationships among decomposed entities and predicts physically plausible future states. Our model generalizes to a variety of environments where different types of robots and objects dynamically interact with each other. GATSBI achieves superior performance on scene decomposition and video prediction compared to its state-of-the-art counterparts, and can be readily applied to sequential decision-making of an intelligent agent.

Cite

Text

Min et al. "GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00309

Markdown

[Min et al. "GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/min2021cvpr-gatsbi/) doi:10.1109/CVPR46437.2021.00309

BibTeX

@inproceedings{min2021cvpr-gatsbi,
  title     = {{GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction}},
  author    = {Min, Cheol-Hui and Bae, Jinseok and Lee, Junho and Kim, Young Min},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {3074--3083},
  doi       = {10.1109/CVPR46437.2021.00309},
  url       = {https://mlanthology.org/cvpr/2021/min2021cvpr-gatsbi/}
}