Harvesting Mid-Level Visual Concepts from Large-Scale Internet Images
Abstract
Obtaining effective mid-level representations has become an increasingly important task in computer vision. In this paper, we propose a fully automatic algorithm that harvests visual concepts from a large number of Internet images (more than a quarter of a million) using text-based queries. Existing approaches to visual concept learning from Internet images either rely on strong supervision with detailed manual annotations or learn image-level classifiers only. Here, we take advantage of massive, well-organized Google and Bing image data; visual concepts (around 14,000) are automatically extracted from images using word-based queries. Using the learned visual concepts, we show state-of-the-art performance on a variety of benchmark datasets, which demonstrates the effectiveness of the learned mid-level representations: they generalize well to general natural images. Our method shows significant improvement over competing systems in image classification, including those with strong supervision.
Cite
Text
Li et al. "Harvesting Mid-Level Visual Concepts from Large-Scale Internet Images." Conference on Computer Vision and Pattern Recognition, 2013. doi:10.1109/CVPR.2013.115
Markdown
[Li et al. "Harvesting Mid-Level Visual Concepts from Large-Scale Internet Images." Conference on Computer Vision and Pattern Recognition, 2013.](https://mlanthology.org/cvpr/2013/li2013cvpr-harvesting/) doi:10.1109/CVPR.2013.115
BibTeX
@inproceedings{li2013cvpr-harvesting,
title = {{Harvesting Mid-Level Visual Concepts from Large-Scale Internet Images}},
author = {Li, Quannan and Wu, Jiajun and Tu, Zhuowen},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2013},
doi = {10.1109/CVPR.2013.115},
url = {https://mlanthology.org/cvpr/2013/li2013cvpr-harvesting/}
}