On the Regularization of Image Semantics by Modal Expansion
Abstract
Recent research efforts in semantic representations and context modeling are based on the principle of task expansion: that vision problems such as object recognition, scene classification, or retrieval (RCR) cannot be solved in isolation. This work investigates the extended principle of modality expansion: that RCR problems cannot be solved from visual information alone. A semantic image labeling system is augmented with text. Pairs of images and text are mapped to a semantic space, and the text features are used to regularize their image counterparts. This is done with a new cross-modal regularizer, which learns the mapping of the image features that maximizes their average similarity to those derived from text. The proposed regularizer is class-sensitive, combining a set of class-specific denoising transformations with nearest neighbor interpolation of text-based class assignments. Regularization of a state-of-the-art approach to image retrieval is then shown to produce substantial gains in retrieval accuracy, outperforming recent image retrieval approaches.
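The idea of a class-sensitive cross-modal regularizer can be illustrated with a minimal sketch. This is not the authors' formulation: it swaps the similarity-maximizing mapping for a per-class ridge regression from image to text semantic features, and it blends the class-specific maps using the labels of the nearest training images. The function names `fit_class_maps` and `regularize`, the parameters `lam` and `k`, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def fit_class_maps(img_feats, txt_feats, labels, lam=1e-2):
    """Fit one linear map per class from image to text semantic features.

    Assumption: ridge regression stands in for the paper's
    similarity-maximizing mapping.
    """
    dim = img_feats.shape[1]
    maps = {}
    for c in np.unique(labels):
        X = img_feats[labels == c]
        T = txt_feats[labels == c]
        # Closed-form ridge solution: (X^T X + lam * I)^-1 X^T T
        maps[int(c)] = np.linalg.solve(X.T @ X + lam * np.eye(dim), X.T @ T)
    return maps

def regularize(query, train_img, train_labels, maps, k=5):
    """Average the class-specific maps of the k nearest training images.

    Assumption: neighbors are found by Euclidean distance in the image
    semantic space, and each neighbor votes with its class label.
    """
    dists = np.linalg.norm(train_img - query, axis=1)
    nearest = train_labels[np.argsort(dists)[:k]]
    out_dim = next(iter(maps.values())).shape[1]
    out = np.zeros(out_dim)
    for c in nearest:
        out += query @ maps[int(c)]
    out /= len(nearest)
    # Project back to a valid semantic (probability) vector.
    out = np.clip(out, 1e-12, None)
    return out / out.sum()

# Synthetic usage: 3 classes, 4-dimensional semantic vectors.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)
img = rng.dirichlet(np.ones(4), size=60)
txt = rng.dirichlet(np.ones(4), size=60)
maps = fit_class_maps(img, txt, labels)
smoothed = regularize(img[0], img, labels, maps)
```

The per-class maps play the role of the class-specific transformations mentioned in the abstract, while the neighbor vote is a simplified stand-in for the interpolation of text-based class assignments.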
Cite
Text
Pereira and Vasconcelos. "On the Regularization of Image Semantics by Modal Expansion." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6248041
Markdown
[Pereira and Vasconcelos. "On the Regularization of Image Semantics by Modal Expansion." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/pereira2012cvpr-regularization/) doi:10.1109/CVPR.2012.6248041
BibTeX
@inproceedings{pereira2012cvpr-regularization,
title = {{On the Regularization of Image Semantics by Modal Expansion}},
author = {Pereira, José Costa and Vasconcelos, Nuno},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2012},
pages = {3093-3099},
doi = {10.1109/CVPR.2012.6248041},
url = {https://mlanthology.org/cvpr/2012/pereira2012cvpr-regularization/}
}