Learning to Describe Scenes with Programs

Abstract

Human scene perception goes beyond recognizing a collection of objects and their pairwise relations. We understand higher-level, abstract regularities within the scene such as symmetry and repetition. Current vision recognition modules and scene representations fall short in this dimension. In this paper, we present scene programs, representing a scene via a symbolic program for its objects, attributes, and their relations. We also propose a model that infers such scene programs by exploiting a hierarchical, object-based scene representation. Experiments demonstrate that our model works well on synthetic data and transfers to real images with such compositional structure. The use of scene programs has enabled a number of applications, such as complex visual analogy-making and scene extrapolation.
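
To make the idea of a scene program concrete, the following is a minimal sketch in Python. It is not the paper's domain-specific language or its inference model. The `SceneObject` type, the `run_scene_program` function, and the choice of cubes with row-alternating colors are all hypothetical, introduced only for illustration. The sketch shows how a short program with loops can compactly express repetition in a scene, rather than listing every object and its pairwise relations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SceneObject:
    # Hypothetical object record: attributes plus a 2D grid position.
    shape: str
    color: str
    position: Tuple[int, int]


def run_scene_program(rows: int, cols: int, spacing: int) -> List[SceneObject]:
    """Execute an illustrative scene program: a nested loop that places
    a grid of cubes, with the color alternating by row."""
    objects: List[SceneObject] = []
    for i in range(rows):          # repetition along the vertical axis
        for j in range(cols):      # repetition along the horizontal axis
            color = "red" if i % 2 == 0 else "blue"
            objects.append(SceneObject("cube", color, (j * spacing, i * spacing)))
    return objects


# Six objects are described by one short program instead of six separate entries.
scene = run_scene_program(rows=2, cols=3, spacing=2)
for obj in scene:
    print(obj)
```

Changing a single argument, for example `cols=5`, extends the same regular pattern, which is loosely the kind of scene extrapolation that a program-based description makes straightforward.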

Cite

Text

Liu et al. "Learning to Describe Scenes with Programs." International Conference on Learning Representations, 2019.

Markdown

[Liu et al. "Learning to Describe Scenes with Programs." International Conference on Learning Representations, 2019.](https://mlanthology.org/iclr/2019/liu2019iclr-learning-a/)

BibTeX

@inproceedings{liu2019iclr-learning-a,
  title     = {{Learning to Describe Scenes with Programs}},
  author    = {Liu, Yunchao and Wu, Zheng and Ritchie, Daniel and Freeman, William T. and Tenenbaum, Joshua B. and Wu, Jiajun},
  booktitle = {International Conference on Learning Representations},
  year      = {2019},
  url       = {https://mlanthology.org/iclr/2019/liu2019iclr-learning-a/}
}