Images as Bags of Pixels

Jebara, Tony

doi:10.1109/ICCV.2003.1238352

ICCV 2003 pp. 265-272

Abstract

We propose modeling images and related visual objects as bags of pixels or sets of vectors. For instance, gray scale images are modeled as a collection or bag of (X, Y, I) pixel vectors. This representation implies a permutational invariance over the bag of pixels, which is naturally handled by endowing each image with a permutation matrix. Each matrix permits the image to span a manifold of multiple configurations, capturing the vector set's invariance to orderings or permutation transformations. Permutation configurations are optimized while jointly modeling many images via maximum likelihood. The solution is a uniquely solvable convex program, which computes correspondence simultaneously for all images (as opposed to traditional pairwise correspondence solutions). Maximum likelihood performs a nonlinear dimensionality reduction, choosing permutations that compact the permuted image vectors into a volumetrically minimal subspace. This is highly suitable for principal components analysis which, when applied to the permutationally invariant bag of pixels representation, outperforms PCA on appearance-based vectorization by orders of magnitude. Furthermore, the bag of pixels subspace benefits from automatic correspondence estimation, giving rise to meaningful linear variations such as morphings, translations, and jointly spatio-textural image transformations. Results are shown for several datasets.
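As a rough illustration of the representation described in the abstract, the sketch below turns a tiny grayscale image into a bag of (X, Y, I) tuples and checks that reordering the bag loses no information. It does not implement the paper's convex program, maximum likelihood estimation, or PCA step. The helper names `image_to_bag` and `bag_to_image` and the 3x2 toy image are assumptions made for this example only.

```python
import random


def image_to_bag(image):
    """Turn a grayscale image (a list of rows) into a list of (x, y, intensity) tuples."""
    return [(x, y, intensity)
            for y, row in enumerate(image)
            for x, intensity in enumerate(row)]


def bag_to_image(bag, width, height):
    """Rebuild the pixel grid from a bag; the order of the tuples is irrelevant."""
    image = [[0] * width for _ in range(height)]
    for x, y, intensity in bag:
        image[y][x] = intensity
    return image


# Hypothetical 3x2 toy image (values are arbitrary intensities).
image = [[0, 50, 100],
         [150, 200, 250]]
bag = image_to_bag(image)

# A shuffled copy is a different ordering of the same bag of pixels.
shuffled = bag[:]
random.Random(0).shuffle(shuffled)

# Both orderings contain the same elements and decode to the same image.
same_set = sorted(bag) == sorted(shuffled)
restored = bag_to_image(shuffled, width=3, height=2)
```

Because each tuple carries its own coordinates, any ordering of the bag describes the same image. This is the permutational invariance the abstract refers to. Choosing one ordering per image so that a whole collection lines up well is the part the paper addresses, and it is not attempted here.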

Cite

Text

Jebara. "Images as Bags of Pixels." IEEE/CVF International Conference on Computer Vision, 2003. doi:10.1109/ICCV.2003.1238352

Markdown

[Jebara. "Images as Bags of Pixels." IEEE/CVF International Conference on Computer Vision, 2003.](https://mlanthology.org/iccv/2003/jebara2003iccv-images/) doi:10.1109/ICCV.2003.1238352

BibTeX

@inproceedings{jebara2003iccv-images,
  title     = {{Images as Bags of Pixels}},
  author    = {Jebara, Tony},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year      = {2003},
  pages     = {265-272},
  doi       = {10.1109/ICCV.2003.1238352},
  url       = {https://mlanthology.org/iccv/2003/jebara2003iccv-images/}
}