Learning Video Saliency from Human Gaze Using Candidate Selection
Abstract
In recent years, remarkable progress has been made in visual saliency modeling. Our interest is in video saliency. Since videos are fundamentally different from still images, they are viewed differently by human observers. For example, each video frame is observed for only a fraction of a second, while a still image can be viewed leisurely. Therefore, video saliency estimation methods should differ substantially from image saliency methods. In this paper we propose a novel method for video saliency estimation, which is inspired by the way people watch videos. We explicitly model the continuity of the video by predicting the saliency map of a given frame, conditioned on the map from the previous frame. Furthermore, accuracy and computation speed are improved by restricting the salient locations to a carefully selected candidate set. We validate our method using two gaze-tracked video datasets and show that it outperforms the state of the art.
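To make the two ideas in the abstract concrete, conditioning each frame's map on the previous one and restricting computation to a small candidate set, here is a minimal, hypothetical Python sketch. The static intensity cue, the candidate counts, the Gaussian rendering, and the scorer are all illustrative placeholders; in the paper the candidate scoring is learned from human gaze data.

```python
# Hypothetical sketch of frame-to-frame saliency with candidate selection.
# All names and parameters (n_static, n_prev, sigma, the toy cue/scorer)
# are assumptions for illustration, not the paper's actual model.
import numpy as np

def candidate_locations(frame, prev_saliency, n_static=10, n_prev=5):
    """Pick candidate pixels: peaks of a simple static cue plus peaks of the
    previous frame's saliency map (the conditioning on the previous frame)."""
    # Toy static cue: absolute deviation from the mean intensity.
    cue = np.abs(frame - frame.mean())
    static_idx = np.argsort(cue, axis=None)[-n_static:]
    prev_idx = np.argsort(prev_saliency, axis=None)[-n_prev:]
    idx = np.unique(np.concatenate([static_idx, prev_idx]))
    return np.stack(np.unravel_index(idx, frame.shape), axis=1)

def render_saliency(shape, centers, scores, sigma=8.0):
    """Turn scored candidates into a dense map: one Gaussian blob per
    candidate, weighted by its score."""
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    out = np.zeros(shape)
    for (cy, cx), s in zip(centers, scores):
        out += s * np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
    return out / (out.max() + 1e-8)

def video_saliency(frames):
    """Process frames sequentially; each map depends on the previous one
    only through the candidate set, keeping per-frame work small."""
    prev = np.zeros(frames[0].shape)
    maps = []
    for frame in frames:
        cands = candidate_locations(frame, prev)
        # Toy scorer: reuse the static cue at each candidate location.
        # The paper instead ranks candidates with a model trained on gaze.
        scores = np.abs(frame - frame.mean())[cands[:, 0], cands[:, 1]]
        prev = render_saliency(frame.shape, cands, scores)
        maps.append(prev)
    return maps

# Example on random "video" data.
frames = [np.random.rand(64, 64) for _ in range(3)]
maps = video_saliency(frames)
print(maps[0].shape)  # (64, 64)
```

Because only a handful of candidates are scored per frame rather than every pixel, this structure illustrates how restricting attention to a candidate set improves both speed and accuracy, as claimed in the abstract.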
Cite
Text
Rudoy et al. "Learning Video Saliency from Human Gaze Using Candidate Selection." Conference on Computer Vision and Pattern Recognition, 2013. doi:10.1109/CVPR.2013.152
Markdown
[Rudoy et al. "Learning Video Saliency from Human Gaze Using Candidate Selection." Conference on Computer Vision and Pattern Recognition, 2013.](https://mlanthology.org/cvpr/2013/rudoy2013cvpr-learning/) doi:10.1109/CVPR.2013.152
BibTeX
@inproceedings{rudoy2013cvpr-learning,
title = {{Learning Video Saliency from Human Gaze Using Candidate Selection}},
author = {Rudoy, Dmitry and Goldman, Dan B. and Shechtman, Eli and Zelnik-Manor, Lihi},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2013},
doi = {10.1109/CVPR.2013.152},
url = {https://mlanthology.org/cvpr/2013/rudoy2013cvpr-learning/}
}