Object Grounding via Iterative Context Reasoning
Abstract
In this paper, we tackle the problem of weakly-supervised object grounding. Given an image and a set of queries extracted from its description, the goal is to localize each query in the image. In the weakly-supervised setting, ground-truth query groundings are not available at training time. We propose a novel approach for weakly-supervised object grounding through iterative context reasoning, in which query representations and region representations are updated iteratively, each conditioned on the other. This iterative contextual refinement gradually resolves ambiguity and vagueness in the queries and regions, helping to address key challenges in grounding. We show the effectiveness of our proposed model on two challenging video object grounding datasets.
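The abstract describes alternating updates of query and region representations, each conditioned on the other, followed by query-to-region grounding. Below is a minimal sketch of that idea in PyTorch; the module name, linear update layers, dot-product attention, and number of iterations are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class IterativeContextReasoning(nn.Module):
    """Minimal sketch of iterative query/region refinement (hypothetical)."""

    def __init__(self, dim: int = 512, num_iters: int = 3):
        super().__init__()
        self.num_iters = num_iters
        # Hypothetical update layers: each modality is refined
        # conditioned on an attention-weighted summary of the other.
        self.query_update = nn.Linear(2 * dim, dim)
        self.region_update = nn.Linear(2 * dim, dim)

    def forward(self, queries: torch.Tensor, regions: torch.Tensor) -> torch.Tensor:
        # queries: (num_queries, dim), regions: (num_regions, dim)
        for _ in range(self.num_iters):
            # Cross-modal affinities between queries and regions.
            scores = queries @ regions.t()                       # (Q, R)
            q2r = torch.softmax(scores, dim=-1) @ regions        # region context per query
            r2q = torch.softmax(scores.t(), dim=-1) @ queries    # query context per region
            # Refine each side conditioned on the other's context.
            queries = torch.relu(self.query_update(torch.cat([queries, q2r], dim=-1)))
            regions = torch.relu(self.region_update(torch.cat([regions, r2q], dim=-1)))
        # Grounding scores after refinement: per-query distribution over regions.
        return torch.softmax(queries @ regions.t(), dim=-1)
```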
Cite
Text
Chen et al. "Object Grounding via Iterative Context Reasoning." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00177
Markdown
[Chen et al. "Object Grounding via Iterative Context Reasoning." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/chen2019iccvw-object/) doi:10.1109/ICCVW.2019.00177
BibTeX
@inproceedings{chen2019iccvw-object,
title = {{Object Grounding via Iterative Context Reasoning}},
author = {Chen, Lei and Zhai, Mengyao and He, Jiawei and Mori, Greg},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2019},
pages = {1407--1415},
doi = {10.1109/ICCVW.2019.00177},
url = {https://mlanthology.org/iccvw/2019/chen2019iccvw-object/}
}