Compact Bilinear Pooling

Abstract

Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition, and face recognition. However, bilinear features are high dimensional, typically on the order of hundreds of thousands to a few million dimensions, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods. Experiments illustrate the utility of the proposed representations for image classification and few-shot learning across several datasets.
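
To make the dimensionality argument concrete, the following is a minimal NumPy sketch that contrasts full bilinear pooling with a Tensor Sketch style compact projection. The abstract does not name the two proposed representations, so the choice of Tensor Sketch here is an assumption for illustration, not the authors' implementation. The function names (`full_bilinear`, `count_sketch_matrix`, `compact_bilinear_ts`) and the sizes (512 channels, 49 spatial locations, 4096 output dimensions) are likewise hypothetical.

```python
import numpy as np

def full_bilinear(X):
    """Sum-pool the outer products x x^T over all locations.

    X has shape (n_locations, c); the result has c * c entries.
    """
    return (X.T @ X).reshape(-1)

def count_sketch_matrix(c, d, rng):
    """Build a random Count Sketch projection from c to d dimensions.

    Each input channel i is sent to one output bin h[i] with sign s[i].
    Returns the (c, d) projection matrix together with h and s.
    """
    h = rng.integers(0, d, size=c)
    s = rng.choice([-1.0, 1.0], size=c)
    M = np.zeros((c, d))
    M[np.arange(c), h] = s
    return M, h, s

def compact_bilinear_ts(X, M1, M2):
    """Tensor Sketch style pooling (assumed variant, for illustration).

    The circular convolution of two Count Sketches of x, computed via the
    FFT, is a Count Sketch of the outer product x x^T. The per-location
    sketches are then sum-pooled over locations.
    """
    f1 = np.fft.fft(X @ M1, axis=1)
    f2 = np.fft.fft(X @ M2, axis=1)
    return np.fft.ifft(f1 * f2, axis=1).real.sum(axis=0)

# Illustrative sizes only: 49 locations, 512 channels, 4096 output dims.
rng = np.random.default_rng(0)
c, d = 512, 4096
X = rng.standard_normal((49, c))
M1, _, _ = count_sketch_matrix(c, d, rng)
M2, _, _ = count_sketch_matrix(c, d, rng)

print(full_bilinear(X).shape)                # (262144,)
print(compact_bilinear_ts(X, M1, M2).shape)  # (4096,)
```

With these assumed sizes, the full bilinear feature has 512 × 512 = 262,144 entries, which falls in the "hundreds of thousands" range the abstract mentions, while the sketch has a few thousand. Every step is a linear map, an FFT, or an elementwise product, so gradients can flow through it, which is consistent with the back-propagation claim above.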

Cite

Text

Gao et al. "Compact Bilinear Pooling." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.41

Markdown

[Gao et al. "Compact Bilinear Pooling." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/gao2016cvpr-compact/) doi:10.1109/CVPR.2016.41

BibTeX

@inproceedings{gao2016cvpr-compact,
  title     = {{Compact Bilinear Pooling}},
  author    = {Gao, Yang and Beijbom, Oscar and Zhang, Ning and Darrell, Trevor},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.41},
  url       = {https://mlanthology.org/cvpr/2016/gao2016cvpr-compact/}
}