Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks

Abstract

This paper addresses a new problem: jointly inferring human attention, intentions, and tasks from videos. Given an RGB-D video in which a human performs a task, we answer three questions simultaneously: 1) where the human is looking (attention prediction); 2) why the human is looking there (intention prediction); and 3) what task the human is performing (task recognition). We propose a hierarchical human-attention-object (HAO) model that represents tasks, intentions, and attention under a unified framework. A task is represented as a sequence of intentions that transition from one to another. An intention is composed of the human pose, attention, and objects. A beam search algorithm performs inference on the HAO graph to output the attention, intention, and task results. We built a new video dataset of tasks, intentions, and attention. It contains 14 task classes, 70 intention categories, 28 object classes, 809 videos, and approximately 330,000 frames. Experiments show that our approach outperforms existing approaches.
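
To make the inference step concrete, the following is a minimal sketch of beam search over a sequence of intentions, not the paper's implementation. It assumes each frame already has a log-likelihood per intention (in the full model these scores would depend on pose, attention, and objects) and that intention transitions are given as log-probabilities. The function name `beam_search`, the intention labels `search` and `fetch`, and all numbers are hypothetical.

```python
import math

def beam_search(frame_scores, transition, beam_width=2):
    """Return the best (log_score, intention_path) found by beam search.

    frame_scores: list of dicts {intention: log-likelihood}, one per frame
                  (assumed to be precomputed; hypothetical input format).
    transition:   dict {(prev, cur): log-probability}; a missing pair is
                  treated as a disallowed transition.
    """
    # Start one hypothesis per intention in the first frame.
    beams = [(score, [label]) for label, score in frame_scores[0].items()]
    beams = sorted(beams, key=lambda b: b[0], reverse=True)[:beam_width]

    for scores in frame_scores[1:]:
        candidates = []
        for total, path in beams:
            for label, score in scores.items():
                trans = transition.get((path[-1], label))
                if trans is None:
                    continue  # transition not allowed
                candidates.append((total + trans + score, path + [label]))
        # Keep only the highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda b: b[0], reverse=True)[:beam_width]

    return beams[0] if beams else None

# Toy example with made-up numbers (not from the paper's dataset).
log = math.log
frame_scores = [
    {"search": log(0.9), "fetch": log(0.1)},
    {"search": log(0.3), "fetch": log(0.7)},
    {"search": log(0.1), "fetch": log(0.9)},
]
transition = {
    ("search", "search"): log(0.6),
    ("search", "fetch"): log(0.4),
    ("fetch", "fetch"): log(1.0),
}
best_score, best_path = beam_search(frame_scores, transition, beam_width=2)
print(best_path)
```

Because only `beam_width` hypotheses survive each frame, the search is approximate: a wider beam explores more intention sequences at a higher cost.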

Cite

Text

Wei et al. "Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00711

Markdown

[Wei et al. "Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/wei2018cvpr-they/) doi:10.1109/CVPR.2018.00711

BibTeX

@inproceedings{wei2018cvpr-they,
  title     = {{Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks}},
  author    = {Wei, Ping and Liu, Yang and Shu, Tianmin and Zheng, Nanning and Zhu, Song-Chun},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00711},
  url       = {https://mlanthology.org/cvpr/2018/wei2018cvpr-they/}
}