Joint Embeddings of Scene Graphs and Images

Belilovsky, Eugene; Blaschko, Matthew B.; Kiros, Jamie Ryan; Urtasun, Raquel; Zemel, Richard S.

ICLR 2017

Abstract

Multimodal representations of text and images have become popular in recent years. Text, however, has inherent ambiguities when describing visual scenes, which has led to the recent development of datasets with detailed graphical descriptions in the form of scene graphs. We consider the task of jointly representing semantically precise scene graphs and images. We propose models for representing scene graphs and aligning them with images. We investigate methods based on bag-of-words and subpath representations, as well as neural networks. Our investigation proposes and contrasts several models that can address this task and highlights some unique challenges in both model design and evaluation.
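
To make the difference between the bag-of-words and subpath representations concrete, here is a minimal Python sketch. The abstract does not define either representation precisely, so everything below is an assumption made for illustration: the toy graph, the helper names `bag_of_words` and `subpaths`, and the choice to treat single objects, object-attribute pairs, and subject-relation-object triples as the subpaths. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy scene graph (not taken from the paper):
# object ids mapped to labels, plus attribute and relation edges.
objects = {"o1": "man", "o2": "horse"}
attributes = [("o1", "tall"), ("o2", "brown")]
relations = [("o1", "riding", "o2")]


def bag_of_words(objects, attributes, relations):
    """Count every label in the graph, ignoring its structure."""
    words = list(objects.values())
    words += [attr for _, attr in attributes]
    words += [rel for _, rel, _ in relations]
    return Counter(words)


def subpaths(objects, attributes, relations):
    """Count short labeled paths so that some structure is preserved.

    Assumed path types: (object,), (object, attribute),
    and (subject, relation, object).
    """
    paths = [(label,) for label in objects.values()]
    paths += [(objects[o], attr) for o, attr in attributes]
    paths += [(objects[s], rel, objects[t]) for s, rel, t in relations]
    return Counter(paths)


bow = bag_of_words(objects, attributes, relations)
sp = subpaths(objects, attributes, relations)
```

In this sketch, swapping the direction of the relation (a horse riding a man) leaves `bow` unchanged but changes `sp`, which is one way to see why a structure-aware representation might help when aligning graphs with images.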

Cite

Text

Belilovsky et al. "Joint Embeddings of Scene Graphs and Images." International Conference on Learning Representations, 2017.

Markdown

[Belilovsky et al. "Joint Embeddings of Scene Graphs and Images." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/belilovsky2017iclr-joint/)

BibTeX

@inproceedings{belilovsky2017iclr-joint,
  title     = {{Joint Embeddings of Scene Graphs and Images}},
  author    = {Belilovsky, Eugene and Blaschko, Matthew B. and Kiros, Jamie Ryan and Urtasun, Raquel and Zemel, Richard S.},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/belilovsky2017iclr-joint/}
}