Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos
Abstract
The ability to discover objects from raw videos and to predict their future dynamics is crucial for achieving general intelligence. While existing methods accomplish these two tasks separately, i.e., learning object segmentation with fixed dynamics or learning dynamics with known system states, we explore the feasibility of accomplishing both jointly in a self-supervised setting for physical environments. Critically, we show on real video datasets that learning object dynamics improves the accuracy of discovering dynamical objects.
Cite
Text
Cheng et al. "Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos." NeurIPS 2023 Workshops: AI4Science, 2023.
Markdown
[Cheng et al. "Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos." NeurIPS 2023 Workshops: AI4Science, 2023.](https://mlanthology.org/neuripsw/2023/cheng2023neuripsw-selfsupervised/)
BibTeX
@inproceedings{cheng2023neuripsw-selfsupervised,
title = {{Self-Supervised Learning to Discover Physical Objects and Predict Their Interactions from Raw Videos}},
author = {Cheng, Sheng and Yang, Yezhou and Jiao, Yang and Ren, Yi},
booktitle = {NeurIPS 2023 Workshops: AI4Science},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/cheng2023neuripsw-selfsupervised/}
}