Multi-Scale Recognition with DAG-CNNs

Abstract

We explore multi-scale convolutional neural nets (CNNs) for image classification. Contemporary approaches extract features from a single output layer. By extracting features from multiple layers, one can simultaneously reason about high, mid, and low-level features during classification. The resulting multi-scale architecture can itself be seen as a feed-forward model that is structured as a directed acyclic graph (DAG-CNNs). We use DAG-CNNs to learn a set of multi-scale features that can be effectively shared between coarse and fine-grained classification tasks. While fine-tuning such models helps performance, we show that even "off-the-shelf" multi-scale features perform quite well. We present extensive analysis and demonstrate state-of-the-art classification performance on three standard scene benchmarks (SUN397, MIT67, and Scene15). On the heavily benchmarked MIT67 and Scene15 datasets, our results reduce the lowest previously-reported error by 23.9% and 9.5%, respectively.
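
To make the multi-scale idea concrete, the sketch below is an assumption-laden approximation, not the authors' implementation: it taps several ReLU outputs of a pretrained VGG-16 in PyTorch, average-pools each tapped activation map into a fixed-length descriptor, and concatenates the descriptors into one multi-scale feature vector, in the spirit of the "off-the-shelf" multi-scale features described above. The backbone choice and tap indices are hypothetical.

# Illustrative sketch (assumed PyTorch/torchvision API; the backbone and
# tap indices are hypothetical choices, not the paper's exact architecture).
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class MultiScaleFeatures(nn.Module):
    """Concatenate average-pooled activations tapped from several layers."""
    def __init__(self, tap_indices=(4, 9, 16, 23, 30)):
        super().__init__()
        self.backbone = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features
        self.tap_indices = set(tap_indices)  # end of each VGG-16 conv block

    def forward(self, x):
        taps = []
        for i, layer in enumerate(self.backbone):
            x = layer(x)
            if i in self.tap_indices:
                # Global average pooling turns each tapped activation map
                # into a fixed-length descriptor, whatever its resolution.
                taps.append(x.mean(dim=(2, 3)))
        # Low-, mid-, and high-level descriptors side by side.
        return torch.cat(taps, dim=1)

extractor = MultiScaleFeatures().eval()
with torch.no_grad():
    feats = extractor(torch.randn(1, 3, 224, 224))
print(feats.shape)  # torch.Size([1, 1472]): 64+128+256+512+512 channels

A linear classifier trained on these concatenated descriptors would then reason over all scales at once, which is the role the DAG structure plays in the paper.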

Cite

Text

Yang and Ramanan. "Multi-Scale Recognition with DAG-CNNs." International Conference on Computer Vision, 2015. doi:10.1109/ICCV.2015.144

Markdown

[Yang and Ramanan. "Multi-Scale Recognition with DAG-CNNs." International Conference on Computer Vision, 2015.](https://mlanthology.org/iccv/2015/yang2015iccv-multiscale/) doi:10.1109/ICCV.2015.144

BibTeX

@inproceedings{yang2015iccv-multiscale,
  title     = {{Multi-Scale Recognition with DAG-CNNs}},
  author    = {Yang, Songfan and Ramanan, Deva},
  booktitle = {International Conference on Computer Vision},
  year      = {2015},
  doi       = {10.1109/ICCV.2015.144},
  url       = {https://mlanthology.org/iccv/2015/yang2015iccv-multiscale/}
}