Object-Aware Gaze Target Detection
Abstract
Gaze target detection aims to predict the image location where a person is looking and the probability that the gaze falls outside the scene. Several works have tackled this task by regressing a gaze heatmap centered on the gaze location; however, they overlook the relationship between people and the objects they gaze at. This paper proposes a Transformer-based architecture that automatically detects objects (including heads) in the scene to build associations between every head and the gazed head or object, resulting in a comprehensive, explainable gaze analysis composed of the gaze target area, the gaze pixel point, and the class and image location of the gazed object. Evaluated on in-the-wild benchmarks, our method achieves state-of-the-art results on all metrics (up to 2.91% gain in AUC, 50% reduction in gaze distance, and 9% gain in out-of-frame average precision) for gaze target detection, and an 11-13% improvement in average precision for the classification and localization of the gazed objects. The code of the proposed method is publicly available.
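To make the heatmap formulation mentioned above concrete, the following is a minimal sketch of a Gaussian gaze heatmap centered on a gaze point, together with a normalized L2 gaze distance. It is not the authors' implementation: the function names, the 64x64 resolution, and the value of sigma are illustrative assumptions.

```python
import numpy as np


def gaze_heatmap(gaze_xy, size=64, sigma=3.0):
    """Build a size x size Gaussian heatmap peaked at a gaze point.

    gaze_xy is (x, y) in normalized [0, 1] image coordinates.
    size and sigma are assumed values, not taken from the paper.
    """
    gx = gaze_xy[0] * (size - 1)
    gy = gaze_xy[1] * (size - 1)
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - gx) ** 2 + (ys - gy) ** 2) / (2.0 * sigma ** 2))


def heatmap_argmax(heatmap):
    """Recover a normalized (x, y) gaze point from the heatmap peak."""
    height, width = heatmap.shape
    iy, ix = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return ix / (width - 1), iy / (height - 1)


def gaze_distance(pred_xy, true_xy):
    """Euclidean distance between two points in normalized coordinates."""
    return float(np.hypot(pred_xy[0] - true_xy[0], pred_xy[1] - true_xy[1]))


# Hypothetical example: a ground-truth gaze point and its heatmap.
target = (0.25, 0.75)
hm = gaze_heatmap(target)
pred = heatmap_argmax(hm)
dist = gaze_distance(pred, target)
```

Here the recovered point differs from the target only by the quantization of the pixel grid. A heatmap alone says where the gaze lands but not which object it lands on, which is the gap the abstract identifies.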
Cite
Text
Tonini et al. "Object-Aware Gaze Target Detection." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01998
Markdown
[Tonini et al. "Object-Aware Gaze Target Detection." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/tonini2023iccv-objectaware/) doi:10.1109/ICCV51070.2023.01998
BibTeX
@inproceedings{tonini2023iccv-objectaware,
title = {{Object-Aware Gaze Target Detection}},
author = {Tonini, Francesco and Dall'Asen, Nicola and Beyan, Cigdem and Ricci, Elisa},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {21860-21869},
doi = {10.1109/ICCV51070.2023.01998},
url = {https://mlanthology.org/iccv/2023/tonini2023iccv-objectaware/}
}