Large-Scale R-CNN with Classifier Adaptive Quantization

Abstract

This paper extends R-CNN, a state-of-the-art object detection method, to larger scales. Applying R-CNN to a large database of thousands to millions of images requires SVM classification of millions to billions of DCNN features extracted from object proposals, which imposes unrealistic computational and memory costs. Our method dramatically narrows down the number of object proposals by using an inverted index and searches efficiently by using residual vector quantization (RVQ). Instead of the k-means that has been used to build inverted indices, we present a novel quantization method designed for linear classification, in which the quantization error is redefined for linear classification. It approximates the error as the empirical error with multiple pre-defined exemplar classifiers and effectively captures the variance and common attributes of object category classifiers. Experimental results show that our method achieves performance comparable to that of applying R-CNN to all images while achieving a 250-fold speed-up and a 180-fold memory reduction. Moreover, our approach significantly outperforms the state-of-the-art large-scale category detection method, with an increase of about 40 to 58% in top-K precision. Scalability is also validated, and we demonstrate that our method can process 100K images in 0.13 s while retaining precision.
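To make the RVQ step in the abstract more concrete, the following is a minimal sketch of residual vector quantization combined with lookup-table scoring of a linear classifier. It is an illustration under stated assumptions, not the authors' implementation: the codebooks are trained with plain k-means (the baseline the paper replaces, not its classifier adaptive quantization), the inverted index is omitted, and all function names and parameters (`kmeans`, `rvq_train`, `rvq_decode`, `approx_scores`, `n_stages`, `k`) are hypothetical.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; stands in for the paper's quantizer (assumption)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def assign_codes(x, centers):
    """Index of the nearest center for every row of x."""
    dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def rvq_train(x, n_stages, k):
    """Each stage quantizes the residual left over by the previous stages."""
    codebooks, codes = [], []
    residual = x.copy()
    for stage in range(n_stages):
        centers = kmeans(residual, k, seed=stage)
        idx = assign_codes(residual, centers)
        codebooks.append(centers)
        codes.append(idx)
        residual = residual - centers[idx]
    return codebooks, np.stack(codes, axis=1)


def rvq_decode(codebooks, codes):
    """Reconstruct vectors as the sum of the selected codewords per stage."""
    return sum(cb[codes[:, m]] for m, cb in enumerate(codebooks))


def approx_scores(w, codebooks, codes):
    """Approximate w . x for all encoded vectors using per-stage lookup tables."""
    tables = [cb @ w for cb in codebooks]  # one dot product per codeword
    return sum(t[codes[:, m]] for m, t in enumerate(tables))
```

Because the reconstruction is a sum of codewords, the classifier score of a compressed feature reduces to summing one table entry per stage. The cost per feature then depends on the number of stages rather than on the feature dimension, which is the general reason quantized codes allow fast and compact linear scoring.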

Cite

Text

Hinami and Satoh. "Large-Scale R-CNN with Classifier Adaptive Quantization." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46487-9_25

Markdown

[Hinami and Satoh. "Large-Scale R-CNN with Classifier Adaptive Quantization." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/hinami2016eccv-large/) doi:10.1007/978-3-319-46487-9_25

BibTeX

@inproceedings{hinami2016eccv-large,
  title     = {{Large-Scale R-CNN with Classifier Adaptive Quantization}},
  author    = {Hinami, Ryota and Satoh, Shin'ichi},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {403-419},
  doi       = {10.1007/978-3-319-46487-9_25},
  url       = {https://mlanthology.org/eccv/2016/hinami2016eccv-large/}
}