DNArch: Learning Convolutional Neural Architectures by Backpropagation

Abstract

We present *Differentiable Neural Architectures* (DNArch), a method that learns the weights and the architecture of CNNs jointly by backpropagation. DNArch enables learning (*i*) the size of convolutional kernels, (*ii*) the width of all layers, (*iii*) the position and value of downsampling layers, and (*iv*) the depth of the network. DNArch treats neural architectures as continuous entities and uses learnable differentiable masks to control their size. Unlike existing methods, DNArch is not limited to a (small) predefined set of possible components, but can instead discover CNN architectures across all feasible combinations of kernel sizes, widths, depths, and downsampling layers. Empirically, DNArch finds effective architectures for classification and dense prediction tasks on sequential and image data. By adding a loss term that controls network complexity, DNArch constrains its search to architectures that respect a predefined computational budget during training.
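
Below is a minimal, illustrative sketch of the core idea in the abstract: a learnable differentiable mask that softly truncates a convolutional kernel, so the effective kernel size can be trained by backpropagation, together with a rough complexity proxy that can be penalized in the loss. This is *not* the authors' implementation; the class and parameter names (`LearnableSizeMask`, `radius`, `temperature`, `complexity_proxy`) are hypothetical and chosen only for this sketch.

```python
# Illustrative sketch only (hypothetical names, not DNArch's actual code):
# a sigmoid mask over kernel positions whose learnable radius controls the
# effective kernel size, plus a soft "size" term usable as a complexity penalty.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableSizeMask(nn.Module):
    """Soft mask over kernel positions; its learnable radius stays differentiable."""

    def __init__(self, max_size: int, init_radius: float = 3.0, temperature: float = 10.0):
        super().__init__()
        # Normalized distance of each kernel position from the center, in [0, 1].
        self.register_buffer("positions", torch.linspace(-1.0, 1.0, max_size).abs())
        self.radius = nn.Parameter(torch.tensor(init_radius / max_size))
        self.temperature = temperature

    def forward(self) -> torch.Tensor:
        # Values decay smoothly toward 0 outside the learned radius.
        return torch.sigmoid(self.temperature * (self.radius.abs() - self.positions))


class MaskedConv1d(nn.Module):
    """Convolution whose kernel is multiplied by the learnable size mask."""

    def __init__(self, channels: int, max_kernel_size: int = 31):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(channels, channels, max_kernel_size) * 0.02)
        self.mask = LearnableSizeMask(max_kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        masked_weight = self.weight * self.mask()  # (out, in, k), mask broadcasts over k
        return F.conv1d(x, masked_weight, padding="same")

    def complexity_proxy(self) -> torch.Tensor:
        # Rough proxy for cost: the mask's "soft size". Adding it to the loss
        # steers the learned architecture toward a computational budget.
        return self.mask().sum()


layer = MaskedConv1d(channels=8)
x = torch.randn(2, 8, 64)
out = layer(x)
loss = out.pow(2).mean() + 1e-3 * layer.complexity_proxy()
loss.backward()  # gradients flow into the mask radius as well as the kernel weights
```

The same masking principle can, per the abstract, be applied along other architectural axes (layer width, depth, downsampling positions); the 1D kernel-size case above is just the simplest instance under these assumptions.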

Cite

Text

Romero and Zeghidour. "DNArch: Learning Convolutional Neural Architectures by Backpropagation." ICML 2023 Workshops: Differentiable Almost Everything, 2023.

Markdown

[Romero and Zeghidour. "DNArch: Learning Convolutional Neural Architectures by Backpropagation." ICML 2023 Workshops: Differentiable Almost Everything, 2023.](https://mlanthology.org/icmlw/2023/romero2023icmlw-dnarch/)

BibTeX

@inproceedings{romero2023icmlw-dnarch,
  title     = {{DNArch: Learning Convolutional Neural Architectures by Backpropagation}},
  author    = {Romero, David W. and Zeghidour, Neil},
  booktitle = {ICML 2023 Workshops: Differentiable Almost Everything},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/romero2023icmlw-dnarch/}
}