Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos

Abstract

The ability to discover objects from raw videos and to predict their future dynamics is crucial for achieving general intelligence. While existing methods accomplish these two tasks separately, i.e., learning object segmentation with fixed dynamics or learning dynamics with known system states, we explore the feasibility of accomplishing the two jointly in a self-supervised setting for physical environments. Critically, we show on real video datasets that learning object dynamics improves the accuracy of discovering dynamical objects.

Cite

Text

Cheng et al. "Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos." NeurIPS 2023 Workshops: AI4Science, 2023.

Markdown

[Cheng et al. "Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos." NeurIPS 2023 Workshops: AI4Science, 2023.](https://mlanthology.org/neuripsw/2023/cheng2023neuripsw-selfsupervised/)

BibTeX

@inproceedings{cheng2023neuripsw-selfsupervised,
  title     = {{Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos}},
  author    = {Cheng, Sheng and Yang, Yezhou and Jiao, Yang and Ren, Yi},
  booktitle = {NeurIPS 2023 Workshops: AI4Science},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/cheng2023neuripsw-selfsupervised/}
}