Focusing Visual Relation Detection on Relevant Relations with Prior Potentials
Abstract
Understanding images relies on understanding how visible objects are linked to each other. Current approaches to Visual Relation Detection (VRD) are hindered by the high frequency of some relations: when too much focus is put on them, more meaningful ones are overlooked. We address this challenge by learning the relative relevance of relations and integrating this term into a novel scene graph extraction scheme. We show that this allows our model to predict relations on fewer, more relevant object pairs. It outperforms MotifNet, a state-of-the-art model, on the Visual Genome dataset, increasing the Class Macro recall, the metric we propose to use, from 38.1% to 44.4%. In addition, we propose a new split of Visual Genome with a more balanced relation distribution, emphasizing the detection of uncommon relations and validating the use of the previous metric. On this set, our model outperforms MotifNet on all metrics, e.g. from 39.6% to 44.0% at 10 predictions per image on the relation classification task.
Cite
Text
Plesse et al. "Focusing Visual Relation Detection on Relevant Relations with Prior Potentials." Winter Conference on Applications of Computer Vision, 2020.
Markdown
[Plesse et al. "Focusing Visual Relation Detection on Relevant Relations with Prior Potentials." Winter Conference on Applications of Computer Vision, 2020.](https://mlanthology.org/wacv/2020/plesse2020wacv-focusing/)
BibTeX
@inproceedings{plesse2020wacv-focusing,
title = {{Focusing Visual Relation Detection on Relevant Relations with Prior Potentials}},
author = {Plesse, Francois and Ginsca, Alexandru and Delezoide, Bertrand and Preteux, Francoise},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2020},
url = {https://mlanthology.org/wacv/2020/plesse2020wacv-focusing/}
}