Modeling Images as Mixtures of Reference Images

Abstract

A state-of-the-art approach to measuring the similarity of two images is to model each image by a continuous distribution, generally a Gaussian mixture model (GMM), and to compute a probabilistic similarity between the GMMs. One limitation of traditional measures such as the Kullback-Leibler (KL) divergence and the probability product kernel (PPK) is that they measure a global match of distributions. This paper introduces a novel image representation. We propose to approximate an image, modeled by a GMM, as a convex combination of K reference image GMMs, and then to describe the image as the K-dimensional vector of mixture weights. The computed weights encode a similarity that favors local matches (i.e., matches of individual Gaussians) and is therefore fundamentally different from the KL divergence or the PPK. Although the computation of the mixture weights is a convex optimization problem, its direct optimization is difficult. We propose two approximate optimization algorithms: the first is based on traditional sampling methods, the second on a variational bound approximation of the true objective function. We apply this novel representation to the image categorization problem and compare its performance to traditional kernel-based methods. We demonstrate a consistent increase in classification accuracy on the PASCAL VOC 2007 dataset.
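
The sampling-based idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs stored as hypothetical `(weights, means, variances)` tuples, draws samples from the image GMM, and fits the K mixture weights with standard EM updates in which the reference GMMs stay fixed. The function names, the sample count, and the iteration count are all illustrative assumptions.

```python
import numpy as np

def sample_gmm(weights, means, variances, n, rng):
    """Draw n samples from a diagonal-covariance GMM."""
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + noise * np.sqrt(variances[comp])

def gmm_log_density(x, weights, means, variances):
    """Log-density of a diagonal-covariance GMM at each row of x."""
    diff = x[:, None, :] - means[None, :, :]  # (n, components, dim)
    log_norm = -0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
    log_comp = log_norm - 0.5 * np.sum(diff**2 / variances, axis=2)
    a = log_comp + np.log(weights)
    m = a.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]

def mixture_weights(image_gmm, reference_gmms, n_samples=2000, n_iter=100, seed=0):
    """Estimate convex weights w so that sum_k w_k * p_k approximates the image GMM.

    Assumption: samples from the image GMM stand in for the expectation in the
    objective, and only the weights are updated (the reference GMMs are fixed).
    """
    rng = np.random.default_rng(seed)
    x = sample_gmm(*image_gmm, n_samples, rng)
    log_p = np.stack([gmm_log_density(x, *g) for g in reference_gmms], axis=1)  # (n, K)
    # Rescaling each row by a constant does not change the responsibilities.
    p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    w = np.full(len(reference_gmms), 1.0 / len(reference_gmms))
    for _ in range(n_iter):
        resp = w * p
        resp /= resp.sum(axis=1, keepdims=True)  # E-step: responsibilities
        w = resp.mean(axis=0)                    # M-step: new mixture weights
    return w

# Toy usage: two single-Gaussian reference "images" and a query that mixes them.
ref_a = (np.array([1.0]), np.array([[0.0, 0.0]]), np.ones((1, 2)))
ref_b = (np.array([1.0]), np.array([[10.0, 10.0]]), np.ones((1, 2)))
query = (np.array([0.7, 0.3]), np.array([[0.0, 0.0], [10.0, 10.0]]), np.ones((2, 2)))
w = mixture_weights(query, [ref_a, ref_b])
```

In this toy setup the returned vector `w` plays the role of the K-dimensional image descriptor (here K = 2). Because the query shares one Gaussian with each reference, the weights should roughly recover the query's own mixing proportions, which illustrates how the representation rewards matches of individual Gaussians.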

Cite

Text

Perronnin and Liu. "Modeling Images as Mixtures of Reference Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2009. doi:10.1109/CVPR.2009.5206781

Markdown

[Perronnin and Liu. "Modeling Images as Mixtures of Reference Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2009.](https://mlanthology.org/cvpr/2009/perronnin2009cvpr-modeling/) doi:10.1109/CVPR.2009.5206781

BibTeX

@inproceedings{perronnin2009cvpr-modeling,
  title     = {{Modeling Images as Mixtures of Reference Images}},
  author    = {Perronnin, Florent and Liu, Yan},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2009},
  pages     = {1770--1777},
  doi       = {10.1109/CVPR.2009.5206781},
  url       = {https://mlanthology.org/cvpr/2009/perronnin2009cvpr-modeling/}
}