Fisher and VLAD with FLAIR

Abstract

A major computational bottleneck in many current algorithms is the evaluation of arbitrary boxes. Dense local analysis and powerful bag-of-words encodings, such as Fisher vectors and VLAD, lead to improved accuracy at the expense of increased computation time. Where a simplification in the representation is tempting, we exploit novel representations while maintaining accuracy. We start from state-of-the-art, fast selective search, but our method will apply to any initial box-partitioning. By representing the picture as sparse integral images, one per codeword, we achieve a Fast Local Area Independent Representation (FLAIR). FLAIR allows for very fast evaluation of any box encoding and still enables spatial pooling. In FLAIR we achieve exact VLAD difference coding, even with L2 and power norms. Finally, by multiple codeword assignments, we achieve exact and approximate Fisher vectors with FLAIR. The result is an 18x speedup, which enables us to set a new state of the art on the challenging 2010 PASCAL VOC objects and the fine-grained categorization of the CUB-2011 200 bird species. We also rank number one in the official ImageNet 2013 detection challenge.
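The idea of one integral image per codeword can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hard assignment of each local descriptor to a single codeword, stores dense rather than sparse tables, and computes only a plain bag-of-words count histogram (not VLAD or Fisher statistics). The names `build_integral_images` and `box_histogram` are hypothetical.

```python
from typing import List, Tuple

# (x, y, codeword index) for one local descriptor; hypothetical input format.
Point = Tuple[int, int, int]
Table = List[List[int]]


def build_integral_images(points: List[Point], width: int, height: int,
                          num_codewords: int) -> List[Table]:
    """Build one (height + 1) x (width + 1) summed-area table per codeword."""
    tables = [[[0] * (width + 1) for _ in range(height + 1)]
              for _ in range(num_codewords)]
    # Place a count of 1 at each descriptor location (shifted by one so that
    # row 0 and column 0 stay zero and act as padding).
    for x, y, k in points:
        tables[k][y + 1][x + 1] += 1
    # Turn the raw counts into cumulative sums, in place, in row-major order.
    for table in tables:
        for row in range(1, height + 1):
            for col in range(1, width + 1):
                table[row][col] += (table[row - 1][col]
                                    + table[row][col - 1]
                                    - table[row - 1][col - 1])
    return tables


def box_histogram(tables: List[Table], x0: int, y0: int,
                  x1: int, y1: int) -> List[int]:
    """Codeword counts inside the half-open box [x0, x1) x [y0, y1).

    Each codeword costs four table lookups, independent of the box area.
    """
    return [t[y1][x1] - t[y0][x1] - t[y1][x0] + t[y0][x0] for t in tables]


# Toy example: five descriptors on a 4 x 3 grid, two codewords.
points = [(0, 0, 0), (1, 0, 1), (2, 2, 0), (3, 1, 1), (1, 1, 0)]
tables = build_integral_images(points, width=4, height=3, num_codewords=2)
print(box_histogram(tables, 0, 0, 2, 2))  # [2, 1]
```

Because the tables are built once per image, every candidate box can be scored with a constant number of lookups per codeword, which is the property that makes evaluating many boxes cheap. Spatial pooling follows in the same way by querying the sub-boxes of a candidate box.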

Cite

Text

van de Sande et al. "Fisher and VLAD with FLAIR." Conference on Computer Vision and Pattern Recognition, 2014. doi:10.1109/CVPR.2014.304

Markdown

[van de Sande et al. "Fisher and VLAD with FLAIR." Conference on Computer Vision and Pattern Recognition, 2014.](https://mlanthology.org/cvpr/2014/vandesande2014cvpr-fisher/) doi:10.1109/CVPR.2014.304

BibTeX

@inproceedings{vandesande2014cvpr-fisher,
  title     = {{Fisher and VLAD with FLAIR}},
  author    = {van de Sande, Koen E. A. and Snoek, Cees G. M. and Smeulders, Arnold W. M.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2014},
  doi       = {10.1109/CVPR.2014.304},
  url       = {https://mlanthology.org/cvpr/2014/vandesande2014cvpr-fisher/}
}