Firefly Neural Architecture Descent: A General Approach for Growing Neural Networks

Abstract

We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures. Our method works in a steepest-descent fashion: it iteratively finds the best network within a functional neighborhood of the original network, where the neighborhood contains a diverse set of candidate network structures. Using a Taylor approximation, the optimal network structure in the neighborhood can be found with a greedy selection procedure. We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate yet resource-efficient neural architectures that avoid catastrophic forgetting in continual learning. Empirically, firefly descent achieves promising results on both neural architecture search and continual learning. In particular, on a challenging continual image classification task, it learns networks that are smaller in size yet achieve higher average accuracy than those learned by state-of-the-art methods.
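To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' code) of how Taylor-based greedy selection for growing a layer wider might look. Candidate neurons are attached with a near-zero output scale, so adding them leaves the current network's function unchanged; the gradient of the loss with respect to each scale then gives a first-order estimate of how much turning that neuron on would decrease the loss, and the top-k candidates are kept. All names, shapes, and the selection rule here are simplifying assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def score_candidate_neurons(x, y, W_in, W_out, num_candidates=16, k=4):
    """Rank candidate hidden neurons by the first-order (Taylor) estimate
    of loss decrease and return the top-k. Illustrative sketch only."""
    d_in, _ = W_in.shape
    d_out = W_out.shape[1]

    # Random incoming/outgoing weights for candidate neurons; their
    # contribution to the output is gated by a scale `eps` initialized at
    # zero, so attaching them does not change the network's predictions.
    cand_in = torch.randn(d_in, num_candidates) * 0.1
    cand_out = torch.randn(num_candidates, d_out) * 0.1
    eps = torch.zeros(num_candidates, requires_grad=True)

    h = F.relu(x @ W_in)                         # existing hidden units
    h_cand = F.relu(x @ cand_in)                 # candidate hidden units
    logits = h @ W_out + (h_cand * eps) @ cand_out
    loss = F.cross_entropy(logits, y)
    loss.backward()

    # d(loss)/d(eps_i) approximates the loss change from activating
    # candidate i; greedily keep the k with largest gradient magnitude.
    topk = torch.topk(eps.grad.abs(), k).indices
    return topk, cand_in[:, topk], cand_out[topk]

# Toy usage with hypothetical dimensions: batch of 32, 8 inputs, 3 classes.
x = torch.randn(32, 8)
y = torch.randint(0, 3, (32,))
W_in = torch.randn(8, 16)
W_out = torch.randn(16, 3)
idx, new_in, new_out = score_candidate_neurons(x, y, W_in, W_out)
print("selected candidate neurons:", idx.tolist())
```

In the paper's framework the candidate set is richer (e.g., growing deeper as well as wider, and splitting existing neurons), but the same pattern of zero-initialized perturbations scored by a first-order expansion underlies the greedy architecture update.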

Cite

Text

Wu et al. "Firefly Neural Architecture Descent: A General Approach for Growing Neural Networks." Neural Information Processing Systems, 2020.

Markdown

[Wu et al. "Firefly Neural Architecture Descent: A General Approach for Growing Neural Networks." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/wu2020neurips-firefly/)

BibTeX

@inproceedings{wu2020neurips-firefly,
  title     = {{Firefly Neural Architecture Descent: A General Approach for Growing Neural Networks}},
  author    = {Wu, Lemeng and Liu, Bo and Stone, Peter and Liu, Qiang},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/wu2020neurips-firefly/}
}