Efficient Neural Architecture Search via Parameters Sharing

Abstract

We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph, where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile, a model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performances, whilst using much fewer GPU-hours than existing automatic model design approaches, and notably, 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, which is on par with the 2.65% test error of NASNet (Zoph et al., 2018).
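
The abstract describes an alternating optimization: shared weights are trained with cross entropy on training data, while the controller is trained with policy gradient using validation performance as the reward. The sketch below is a minimal illustration of that loop under simplifying assumptions: synthetic data, a two-layer search space with two candidate operations per layer, and a plain categorical controller standing in for the paper's LSTM controller. None of the names here come from the ENAS codebase.

# Minimal ENAS-style weight-sharing sketch (assumed toy setup, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Synthetic classification data as a stand-in for a real train/validation split.
X_train, y_train = torch.randn(512, 16), torch.randint(0, 4, (512,))
X_valid, y_valid = torch.randn(256, 16), torch.randint(0, 4, (256,))

class SharedModel(nn.Module):
    """All candidate operations live in one module, so every sampled
    subgraph reuses (shares) the same parameters."""
    def __init__(self, dim=16, n_classes=4):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.ModuleList([nn.Linear(dim, dim), nn.Linear(dim, dim)]),  # layer 0: two candidate ops
            nn.ModuleList([nn.Linear(dim, dim), nn.Linear(dim, dim)]),  # layer 1: two candidate ops
        ])
        self.activations = [torch.relu, torch.tanh]
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x, arch):
        # arch is a list of operation indices, one per layer (a subgraph).
        for layer_ops, choice in zip(self.ops, arch):
            x = self.activations[choice](layer_ops[choice](x))
        return self.head(x)

shared = SharedModel()
controller_logits = nn.Parameter(torch.zeros(2, 2))  # 2 layers x 2 choices

w_opt = torch.optim.SGD(shared.parameters(), lr=0.05)
c_opt = torch.optim.Adam([controller_logits], lr=0.05)
baseline = 0.0  # moving-average baseline to reduce policy-gradient variance

def sample_arch():
    # Sample one operation index per layer and accumulate the log-probability.
    dists = [torch.distributions.Categorical(logits=l) for l in controller_logits]
    choices = [d.sample() for d in dists]
    log_prob = sum(d.log_prob(c) for d, c in zip(dists, choices))
    return [int(c) for c in choices], log_prob

for step in range(200):
    # Phase 1: train the shared weights with cross entropy on training data,
    # using an architecture sampled from the controller.
    arch, _ = sample_arch()
    w_opt.zero_grad()
    F.cross_entropy(shared(X_train, arch), y_train).backward()
    w_opt.step()

    # Phase 2: train the controller with REINFORCE, using the sampled
    # subgraph's validation accuracy as the reward.
    arch, log_prob = sample_arch()
    with torch.no_grad():
        reward = (shared(X_valid, arch).argmax(1) == y_valid).float().mean().item()
    baseline = 0.95 * baseline + 0.05 * reward
    c_opt.zero_grad()
    (-(reward - baseline) * log_prob).backward()
    c_opt.step()

print("controller's preferred architecture:", controller_logits.argmax(1).tolist())

Because every candidate operation keeps its weights across sampled architectures, no child model is trained from scratch; this reuse is what the abstract credits for the roughly 1000x reduction in search cost relative to standard Neural Architecture Search.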

Cite

Text

Pham et al. "Efficient Neural Architecture Search via Parameters Sharing." International Conference on Machine Learning, 2018.

Markdown

[Pham et al. "Efficient Neural Architecture Search via Parameters Sharing." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/pham2018icml-efficient/)

BibTeX

@inproceedings{pham2018icml-efficient,
  title     = {{Efficient Neural Architecture Search via Parameters Sharing}},
  author    = {Pham, Hieu and Guan, Melody and Zoph, Barret and Le, Quoc and Dean, Jeff},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {4095--4104},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/pham2018icml-efficient/}
}