Object Grounding via Iterative Context Reasoning

Abstract

In this paper, we tackle the problem of weakly-supervised object grounding. Given an image and a set of queries extracted from its description, the goal is to localize each query in the image. In the weakly-supervised setting, ground-truth query groundings are not accessible at training time. We propose a novel approach for weakly-supervised object grounding through iterative context reasoning, in which query representations and region representations are updated iteratively, each conditioned on the other. This iterative contextual refinement gradually resolves ambiguity and vagueness in the queries and regions, helping to overcome key challenges in grounding. We show the effectiveness of our proposed model on two challenging video object grounding datasets.
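The iterative refinement described above can be sketched as alternating cross-attention updates between the two sets of representations. The code below is a minimal illustration of that idea, not the paper's actual model: the attention form, the mixing weight `alpha`, and the fixed number of iterations are all assumptions made for this sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def iterative_context_reasoning(queries, regions, num_iters=3, alpha=0.5):
    """Alternately refine query and region representations by attending
    to the other set (hypothetical sketch of the iterative scheme).

    queries: (n_q, d) array of query features
    regions: (n_r, d) array of region features
    """
    Q, R = queries.copy(), regions.copy()
    for _ in range(num_iters):
        # Queries attend to regions: each query is mixed with a
        # region-context vector weighted by dot-product similarity.
        attn_qr = softmax(Q @ R.T)            # (n_q, n_r)
        Q = (1 - alpha) * Q + alpha * (attn_qr @ R)
        # Regions attend to the updated queries in the same way.
        attn_rq = softmax(R @ Q.T)            # (n_r, n_q)
        R = (1 - alpha) * R + alpha * (attn_rq @ Q)
    return Q, R
```

After refinement, grounding could be read off by matching each refined query to its highest-scoring region, e.g. `attn_qr.argmax(axis=1)` in the final iteration.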

Cite

Text

Chen et al. "Object Grounding via Iterative Context Reasoning." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00177

Markdown

[Chen et al. "Object Grounding via Iterative Context Reasoning." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/chen2019iccvw-object/) doi:10.1109/ICCVW.2019.00177

BibTeX

@inproceedings{chen2019iccvw-object,
  title     = {{Object Grounding via Iterative Context Reasoning}},
  author    = {Chen, Lei and Zhai, Mengyao and He, Jiawei and Mori, Greg},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {1407--1415},
  doi       = {10.1109/ICCVW.2019.00177},
  url       = {https://mlanthology.org/iccvw/2019/chen2019iccvw-object/}
}