How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks with DenseNet-Type Skip Connections?

Abstract

DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. In this paper, we reveal that the topology of concatenation-type skip connections is closely related to gradient propagation, which in turn makes the test performance of deep neural networks (DNNs) predictable. To this end, we introduce a new metric called NN-Mass to quantify how effectively information flows through DNNs. Moreover, we empirically show that NN-Mass also works for other types of skip connections, e.g., for ResNets, Wide-ResNets (WRNs), and MobileNets, which contain addition-type skip connections (i.e., residuals or inverted residuals). As such, for both DenseNet-like CNNs and ResNets/WRNs/MobileNets, our theoretically grounded NN-Mass can identify models with similar accuracy despite significantly different size/compute requirements. Detailed experiments on both synthetic and real datasets (e.g., MNIST, CIFAR-10, CIFAR-100, ImageNet) provide extensive evidence for our insights. Finally, the closed-form equation of NN-Mass enables us to design significantly compressed DenseNets (for CIFAR-10) and MobileNets (for ImageNet) directly at initialization, without time-consuming training and/or searching.

Cite

Text

Bhardwaj et al. "How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks with DenseNet-Type Skip Connections?" Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01329

Markdown

[Bhardwaj et al. "How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks with DenseNet-Type Skip Connections?" Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/bhardwaj2021cvpr-topology/) doi:10.1109/CVPR46437.2021.01329

BibTeX

@inproceedings{bhardwaj2021cvpr-topology,
  title     = {{How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks with DenseNet-Type Skip Connections?}},
  author    = {Bhardwaj, Kartikeya and Li, Guihong and Marculescu, Radu},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {13498--13507},
  doi       = {10.1109/CVPR46437.2021.01329},
  url       = {https://mlanthology.org/cvpr/2021/bhardwaj2021cvpr-topology/}
}