Deep Co-Occurrence Feature Learning for Visual Object Recognition

Ya-Fang Shih, Yang-Ming Yeh, Yen-Yu Lin, Ming-Fang Weng, Yi-Chang Lu, Yung-Yu Chuang

CVPR 2017

doi:10.1109/CVPR.2017.772

Abstract

This paper addresses three issues in integrating part-based representations into convolutional neural networks (CNNs) for object recognition. First, most part-based models rely on a few pre-specified object parts, yet the optimal parts for recognition often vary from category to category. Second, acquiring training data with part-level annotation is labor-intensive. Third, modeling spatial relationships between parts in CNNs often involves an exhaustive search of part templates over multiple network streams. We tackle these three issues by introducing a new network layer, called the co-occurrence layer. It extends a convolutional layer to encode the co-occurrence between the visual parts detected by its numerous neurons, instead of a few pre-specified parts. To this end, the feature maps serve as both filters and images, and mutual correlation filtering is conducted between them. The co-occurrence layer is end-to-end trainable. The resultant co-occurrence features are rotation- and translation-invariant, and are robust to object deformation. By applying this new layer to VGG-16 and ResNet-152, we achieve recognition rates of 83.6% and 85.8%, respectively, on the Caltech-UCSD bird benchmark. The source code is available at https://github.com/yafangshih/Deep-COOC.
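The mutual correlation filtering described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see their repository for that). It assumes that each pair of feature maps is scored by the maximum correlation response over all spatial offsets, a detail the abstract does not state, and the function name `cooccurrence_features` is hypothetical.

```python
import numpy as np

def cooccurrence_features(fmaps):
    """Pairwise co-occurrence scores for a stack of feature maps.

    fmaps: array of shape (N, H, W), e.g. activations of one conv layer.
    Returns an (N, N) matrix whose entry [i, j] is the largest correlation
    response obtained when map i is slid as a filter over map j.
    Taking the maximum over offsets is an assumption of this sketch.
    """
    n, h, w = fmaps.shape
    # Zero-pad every map so that all relative offsets are covered.
    padded = np.pad(fmaps, ((0, 0), (h - 1, h - 1), (w - 1, w - 1)))
    cooc = np.full((n, n), -np.inf)
    for dy in range(2 * h - 1):
        for dx in range(2 * w - 1):
            # Each map treated as an image, viewed at offset (dy, dx).
            window = padded[:, dy:dy + h, dx:dx + w]
            # resp[i, j] = sum over positions of fmaps[i] * window[j]
            resp = np.einsum("ihw,jhw->ij", fmaps, window)
            cooc = np.maximum(cooc, resp)
    return cooc

# Toy input: two 5x5 maps, each with a single active location.
maps = np.zeros((2, 5, 5))
maps[0, 1, 1] = 2.0
maps[1, 2, 3] = 3.0
feats = cooccurrence_features(maps)

# Shift both maps by one pixel down and right; both peaks stay in frame.
shifted_feats = cooccurrence_features(np.roll(maps, shift=(1, 1), axis=(1, 2)))
print(feats)
print(np.allclose(feats, shifted_feats))
```

In this toy case the scores do not change when both maps are shifted together, which matches the translation invariance claimed in the abstract. The double loop is written for readability; a practical layer would use a batched convolution and would need to be differentiable for end-to-end training.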


Cite

Text

Shih et al. "Deep Co-Occurrence Feature Learning for Visual Object Recognition." Conference on Computer Vision and Pattern Recognition, 2017. doi:10.1109/CVPR.2017.772

Markdown

[Shih et al. "Deep Co-Occurrence Feature Learning for Visual Object Recognition." Conference on Computer Vision and Pattern Recognition, 2017.](https://mlanthology.org/cvpr/2017/shih2017cvpr-deep/) doi:10.1109/CVPR.2017.772

BibTeX

@inproceedings{shih2017cvpr-deep,
  title     = {{Deep Co-Occurrence Feature Learning for Visual Object Recognition}},
  author    = {Shih, Ya-Fang and Yeh, Yang-Ming and Lin, Yen-Yu and Weng, Ming-Fang and Lu, Yi-Chang and Chuang, Yung-Yu},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2017},
  doi       = {10.1109/CVPR.2017.772},
  url       = {https://mlanthology.org/cvpr/2017/shih2017cvpr-deep/}
}