Top-Down Visual Saliency via Joint CRF and Dictionary Learning
Abstract
Top-down visual saliency facilitates object localization by providing a discriminative representation of target objects and a probability map for reducing the search space. In this paper, we propose a novel top-down saliency model that jointly learns a Conditional Random Field (CRF) and a discriminative dictionary. The proposed model is formulated as a CRF with latent variables. By using sparse codes as latent variables, we jointly train a dictionary modulated by the CRF and a CRF driven by sparse coding. We propose a max-margin approach to train our model via fast inference algorithms. We evaluate our model on the Graz-02 and PASCAL VOC 2007 datasets. Experimental results show that our model performs favorably against state-of-the-art top-down saliency methods. We also observe that the dictionary update significantly improves model performance.
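As a rough illustration of the pipeline the abstract describes, the sketch below computes a sparse code for an image patch over a fixed dictionary (via ISTA, a standard lasso solver; the paper's own dictionary is learned jointly with the CRF) and scores it with a linear weight vector standing in for the CRF unary potential. The dictionary `D`, weights `w`, and all dimensions are toy placeholders, not values from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator, prox of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code(x, D, lam=0.1, n_iter=100):
    """ISTA for min_s 0.5 * ||x - D s||^2 + lam * ||s||_1."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ s - x)
        s = soft_threshold(s - grad / L, lam / L)
    return s

# Toy example with a random dictionary and patch (placeholder data).
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)      # unit-norm atoms, as is standard
x = rng.standard_normal(16)          # one vectorized image patch
s = sparse_code(x, D)                # latent sparse code for the patch

# Stand-in for the learned CRF unary weights (max-margin trained in the paper).
w = rng.standard_normal(32)
saliency = w @ s                     # per-patch saliency score
```

In the paper these per-patch unary scores are combined with pairwise CRF terms over neighboring patches, and the dictionary itself is updated during max-margin training rather than fixed as here.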
Cite
Text
Yang and Yang. "Top-Down Visual Saliency via Joint CRF and Dictionary Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6247940
Markdown
[Yang and Yang. "Top-Down Visual Saliency via Joint CRF and Dictionary Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/yang2012cvpr-top/) doi:10.1109/CVPR.2012.6247940
BibTeX
@inproceedings{yang2012cvpr-top,
title = {{Top-Down Visual Saliency via Joint CRF and Dictionary Learning}},
author = {Yang, Jimei and Yang, Ming-Hsuan},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2012},
pages = {2296-2303},
doi = {10.1109/CVPR.2012.6247940},
url = {https://mlanthology.org/cvpr/2012/yang2012cvpr-top/}
}