HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection

Abstract

Visual relationship detection aims to capture interactions between pairs of objects in images. Relationships between objects and humans represent a particularly important subset of this problem, with implications for challenges such as understanding human behavior and identifying affordances, among others. To address this problem, we first construct a large-scale human-centric visual relationship detection dataset (HCVRD), which provides many more types of relationship annotations (nearly 10K categories) than previously released datasets. This large label space better reflects the reality of human-object interactions, but gives rise to a long-tail distribution problem, which in turn demands a zero-shot approach to labels appearing only in the test set. This is the first time this issue has been addressed. We propose a webly-supervised approach to these problems and demonstrate that the proposed model provides a strong baseline on our HCVRD dataset.

Cite

Text

Zhuang et al. "HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12260

Markdown

[Zhuang et al. "HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/zhuang2018aaai-hcvrd/) doi:10.1609/AAAI.V32I1.12260

BibTeX

@inproceedings{zhuang2018aaai-hcvrd,
  title     = {{HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection}},
  author    = {Zhuang, Bohan and Wu, Qi and Shen, Chunhua and Reid, Ian D. and van den Hengel, Anton},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {7631--7638},
  doi       = {10.1609/AAAI.V32I1.12260},
  url       = {https://mlanthology.org/aaai/2018/zhuang2018aaai-hcvrd/}
}