"What Happens If..." Learning to Predict the Effect of Forces in Images

Abstract

What happens if one pushes a cup sitting on a table toward the edge of the table? How about pushing a desk against a wall? In this paper, we study the problem of understanding the movements of objects as a result of applying external forces to them. For a given force vector applied to a specific location in an image, our goal is to predict long-term sequential movements caused by that force. Doing so entails reasoning about scene geometry, objects, their attributes, and the physical rules that govern the movements of objects. We design a deep neural network model that learns long-term sequential dependencies of object movements while taking into account the geometry and appearance of the scene by combining Convolutional and Recurrent Neural Networks. Training our model requires a large-scale dataset of object movements caused by external forces. To build a dataset of forces in scenes, we reconstructed all images in the SUN RGB-D dataset in a physics simulator to estimate the physical movements of objects caused by external forces applied to them. Our Forces in Scenes (ForScene) dataset contains 65,000 object movements in 3D, which represent a variety of external forces applied to different types of objects. Our experimental evaluations show that the challenging task of predicting the long-term movements of objects in reaction to external forces is possible from a single image. The code and dataset are available at: http://allenai.org/plato/forces.
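The combination described in the abstract (scene features plus an applied force feeding a recurrent model that emits a sequence of movements) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `predict_movements`, the feature vector standing in for CNN output, the four-number force encoding, the number of discrete movement directions, and the randomly initialized (untrained) weights are all assumptions made for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def predict_movements(image_features, force, steps, num_directions,
                      hidden_size=8, seed=0):
    """Unroll a simple recurrent cell over `steps` time steps.

    Assumption: `image_features` stands in for the output of a CNN, and
    `force` is a hypothetical encoding of the applied force. Each step
    yields a probability distribution over discrete movement directions.
    """
    rng = random.Random(seed)
    x = list(image_features) + list(force)  # fuse scene and force inputs
    W_in = random_matrix(hidden_size, len(x), rng)
    W_rec = random_matrix(hidden_size, hidden_size, rng)
    W_out = random_matrix(num_directions, hidden_size, rng)

    h = [0.0] * hidden_size  # initial hidden state
    sequence = []
    for _ in range(steps):
        pre = [a + b for a, b in zip(matvec(W_in, x), matvec(W_rec, h))]
        h = [math.tanh(v) for v in pre]  # recurrent state update
        sequence.append(softmax(matvec(W_out, h)))
    return sequence


features = [0.2, -0.1, 0.5, 0.3]  # placeholder for CNN scene features
force = [1.0, 0.0, 0.4, 0.6]      # hypothetical: direction (x, y) and image location (x, y)
sequence = predict_movements(features, force, steps=6, num_directions=8)
directions = [max(range(len(p)), key=p.__getitem__) for p in sequence]
```

Here `directions` holds the most likely movement class at each time step. With untrained weights the values carry no meaning; the sketch only shows how a single image and a force vector could drive a multi-step prediction.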

Cite

Text

Mottaghi et al. "'What Happens If...' Learning to Predict the Effect of Forces in Images." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46493-0_17

Markdown

[Mottaghi et al. "'What Happens If...' Learning to Predict the Effect of Forces in Images." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/mottaghi2016eccv-happens/) doi:10.1007/978-3-319-46493-0_17

BibTeX

@inproceedings{mottaghi2016eccv-happens,
  title     = {{"What Happens If..." Learning to Predict the Effect of Forces in Images}},
  author    = {Mottaghi, Roozbeh and Rastegari, Mohammad and Gupta, Abhinav and Farhadi, Ali},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {269-285},
  doi       = {10.1007/978-3-319-46493-0_17},
  url       = {https://mlanthology.org/eccv/2016/mottaghi2016eccv-happens/}
}