Encoding Based Saliency Detection for Videos and Images

Abstract

We present a novel video saliency detection method to support human activity recognition and weakly supervised training of activity detection algorithms. Recent research has emphasized the need for analyzing salient information in videos to minimize dataset bias or to supervise weakly labeled training of activity detectors. In contrast to previous methods, we do not rely on training information given by either eye-gaze or annotation data; instead, we propose a fully unsupervised algorithm to find salient regions within videos. In general, we enforce the Gestalt principle of figure-ground segregation for both appearance and motion cues. We introduce an encoding approach that allows for efficient computation of saliency by approximating joint feature distributions. We evaluate our approach on several datasets, including challenging scenarios with cluttered background and camera motion, as well as salient object detection in images. Overall, we demonstrate favorable performance compared to state-of-the-art methods in estimating both ground-truth eye-gaze and activity annotations.

Cite

Text

Mauthner et al. "Encoding Based Saliency Detection for Videos and Images." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7298864

Markdown

[Mauthner et al. "Encoding Based Saliency Detection for Videos and Images." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/mauthner2015cvpr-encoding/) doi:10.1109/CVPR.2015.7298864

BibTeX

@inproceedings{mauthner2015cvpr-encoding,
  title     = {{Encoding Based Saliency Detection for Videos and Images}},
  author    = {Mauthner, Thomas and Possegger, Horst and Waltner, Georg and Bischof, Horst},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2015},
  doi       = {10.1109/CVPR.2015.7298864},
  url       = {https://mlanthology.org/cvpr/2015/mauthner2015cvpr-encoding/}
}