Multicoated Supermasks Enhance Hidden Networks

Abstract

Hidden Networks (Ramanujan et al., 2020) showed that accurate subnetworks can be found within a randomly weighted neural network by training a connectivity mask, referred to as a supermask. We show that the supermask stops improving even though gradients are not zero, thus underutilizing backpropagated information. To address this, we propose a method that extends Hidden Networks by training an overlay of multiple hierarchical supermasks, a multicoated supermask. We show that using multiple supermasks for a single task achieves higher accuracy without additional training cost. Experiments on CIFAR-10 and ImageNet show that Multicoated Supermasks enhance the tradeoff between accuracy and model size. A ResNet-101 using a 7-coated supermask outperforms its Hidden Networks counterpart by 4%, matching the accuracy of a dense ResNet-50 while being an order of magnitude smaller.
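
For intuition, here is a minimal PyTorch sketch of the idea the abstract describes: weights stay frozen at their random initialization, a score is trained per weight (edge-popup style, as in Ramanujan et al., 2020), and the multicoated supermask is the sum of nested top-k masks taken at several sparsity levels. Names such as `GetSubnet`, `MulticoatedLinear`, and the `ks` thresholds are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GetSubnet(torch.autograd.Function):
    """Binary top-k mask over scores with a straight-through gradient,
    in the style of edge-popup (Ramanujan et al., 2020)."""
    @staticmethod
    def forward(ctx, scores, k):
        mask = torch.zeros_like(scores)
        n_keep = int(k * scores.numel())
        # keep the n_keep highest-scoring weights
        _, idx = scores.flatten().topk(n_keep)
        mask.view(-1)[idx] = 1.0
        return mask

    @staticmethod
    def backward(ctx, grad_output):
        # straight-through estimator: gradients pass to the scores unchanged
        return grad_output, None

class MulticoatedLinear(nn.Linear):
    """Linear layer with frozen random weights and a learned
    multicoated supermask: the sum of nested top-k masks."""
    def __init__(self, in_features, out_features, ks=(0.5, 0.4, 0.3)):
        super().__init__(in_features, out_features, bias=False)
        self.weight.requires_grad = False  # weights stay at random init
        self.scores = nn.Parameter(torch.randn_like(self.weight) * 0.01)
        self.ks = sorted(ks, reverse=True)  # larger k first => nested coats

    def forward(self, x):
        # Overlay of coats: since top-30% scores are a subset of top-50%,
        # the masks are hierarchical, and summing them gives the
        # highest-scoring weights an integer multiplicity > 1.
        mask = sum(GetSubnet.apply(self.scores.abs(), k) for k in self.ks)
        return F.linear(x, self.weight * mask)
```

Because each additional coat reuses the same scores at a tighter threshold, training N coats costs essentially one backward pass through the scores, consistent with the abstract's claim of higher accuracy without additional training cost.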

Cite

Text

Okoshi et al. "Multicoated Supermasks Enhance Hidden Networks." International Conference on Machine Learning, 2022.

Markdown

[Okoshi et al. "Multicoated Supermasks Enhance Hidden Networks." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/okoshi2022icml-multicoated/)

BibTeX

@inproceedings{okoshi2022icml-multicoated,
  title     = {{Multicoated Supermasks Enhance Hidden Networks}},
  author    = {Okoshi, Yasuyuki and García-Arias, Ángel López and Hirose, Kazutoshi and Ando, Kota and Kawamura, Kazushi and Van Chu, Thiem and Motomura, Masato and Yu, Jaehoon},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {17045--17055},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/okoshi2022icml-multicoated/}
}