Pixels to Graphs by Associative Embedding

Abstract

Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene, but they can also capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.
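The core grouping idea behind associative embeddings can be illustrated with a minimal sketch: the network emits an embedding for each detected vertex and a pair of embeddings (source and target) for each detected edge, and edges are attached to whichever vertex has the nearest embedding. The function name, the L1 distance, and the embedding dimensionality below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def match_edges(vertex_emb, edge_src_emb, edge_tgt_emb):
    """Attach each detected edge to its source/target vertices by
    nearest-embedding distance (a simplified sketch of associative
    embedding grouping; distance metric and shapes are assumptions)."""
    def nearest(queries):
        # Pairwise L1 distances between edge embeddings and vertex
        # embeddings, then argmin over vertices for each edge slot.
        d = np.abs(queries[:, None, :] - vertex_emb[None, :, :]).sum(-1)
        return d.argmin(axis=1)
    return nearest(edge_src_emb), nearest(edge_tgt_emb)

# Toy example: three vertices with 1-D embeddings; one edge whose
# source embedding lands near vertex 1 and target near vertex 2.
vertices = np.array([[0.1], [1.0], [2.0]])
src, tgt = match_edges(vertices, np.array([[0.95]]), np.array([[2.1]]))
```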

Cite

Text

Newell and Deng. "Pixels to Graphs by Associative Embedding." Neural Information Processing Systems, 2017.

Markdown

[Newell and Deng. "Pixels to Graphs by Associative Embedding." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/newell2017neurips-pixels/)

BibTeX

@inproceedings{newell2017neurips-pixels,
  title     = {{Pixels to Graphs by Associative Embedding}},
  author    = {Newell, Alejandro and Deng, Jia},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {2171--2180},
  url       = {https://mlanthology.org/neurips/2017/newell2017neurips-pixels/}
}