DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

Xuan Shen, Yaohua Wang, Ming Lin, Yilun Huang, Hao Tang, Xiuyu Sun, Yanzhi Wang

CVPR 2023 pp. 6163-6173

doi:10.1109/CVPR52729.2023.00597 /cvpr/2023/shen2023cvpr-deepmad/

Abstract

The rapid advances in Vision Transformer (ViT) refresh the state-of-the-art performances in various vision tasks, overshadowing the conventional CNN-based models. This ignites a few recent striking-back research in the CNN world showing that pure CNN models can achieve as good performance as ViT models when carefully tuned. While encouraging, designing such high-performance CNN models is challenging, requiring non-trivial prior knowledge of network design. To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way. In DeepMAD, a CNN network is modeled as an information processing system whose expressiveness and effectiveness can be analytically formulated by their structural parameters. Then a constrained mathematical programming (MP) problem is proposed to optimize these structural parameters. The MP problem can be easily solved by off-the-shelf MP solvers on CPUs with a small memory footprint. In addition, DeepMAD is a pure mathematical framework: no GPU or training data is required during network design. The superiority of DeepMAD is validated on multiple large-scale computer vision benchmark datasets. Notably on ImageNet-1k, only using conventional convolutional layers, DeepMAD achieves 0.7% and 1.5% higher top-1 accuracy than ConvNeXt and Swin on Tiny level, and 0.8% and 0.9% higher on Small level.

PDF CVPR Semantic Scholar

Cite

Text

Shen et al. "DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00597

Markdown

[Shen et al. "DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/shen2023cvpr-deepmad/) doi:10.1109/CVPR52729.2023.00597

BibTeX

@inproceedings{shen2023cvpr-deepmad,
  title     = {{DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network}},
  author    = {Shen, Xuan and Wang, Yaohua and Lin, Ming and Huang, Yilun and Tang, Hao and Sun, Xiuyu and Wang, Yanzhi},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {6163-6173},
  doi       = {10.1109/CVPR52729.2023.00597},
  url       = {https://mlanthology.org/cvpr/2023/shen2023cvpr-deepmad/}
}