Robust Mixture-of-Expert Training for Convolutional Neural Networks

Abstract

Sparsely-gated Mixture of Experts (MoE), an emerging deep model architecture, has demonstrated great promise for enabling high-accuracy and ultra-efficient model inference. Despite the growing popularity of MoE, little work has investigated its potential to advance convolutional neural networks (CNNs), especially in terms of adversarial robustness. Since the lack of robustness has become one of the main hurdles for CNNs, in this paper we ask: How can we adversarially robustify a CNN-based MoE model? Can we robustly train it like an ordinary CNN model? Our pilot study shows that the conventional adversarial training (AT) mechanism (developed for vanilla CNNs) is no longer effective at robustifying an MoE-CNN. To better understand this phenomenon, we dissect the robustness of an MoE-CNN into two dimensions: the robustness of routers (i.e., gating functions that select data-specific experts) and the robustness of experts (i.e., the router-guided pathways defined by subnetworks of the backbone CNN). Our analyses show that routers and experts are hard to adapt to each other under vanilla AT. Thus, we propose a new router-expert alternating adversarial training framework for MoE, termed AdvMoE. The effectiveness of our proposal is justified across 4 commonly used CNN architectures over 4 benchmark datasets. We find that AdvMoE achieves a 1%-4% adversarial robustness improvement over the original dense CNN, while enjoying the efficiency merit of sparsely-gated MoE and yielding more than a 50% reduction in inference cost.
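
To make the alternating idea concrete, below is a minimal PyTorch-style sketch of router-expert alternating adversarial training. This is not the authors' released implementation: the parameter-splitting helpers router_parameters() / expert_parameters(), the PGD hyperparameters, and the epoch-level alternation schedule are all illustrative assumptions.

# Minimal sketch of router-expert alternating AT, assuming a PyTorch
# MoE-CNN whose parameters can be split into a router group and an
# expert group. Interfaces marked "assumed" are hypothetical.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Standard L_inf PGD; assumes inputs are scaled to [0, 1].
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def advmoe_epoch(model, loader, opt_router, opt_expert, update_router):
    # One alternating phase: adversarially train either the routers or
    # the experts, leaving the other parameter group un-stepped.
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)
        loss = F.cross_entropy(model(x_adv), y)
        model.zero_grad()
        loss.backward()
        (opt_router if update_router else opt_expert).step()

# Assumed usage: alternate the two phases across epochs so routers and
# experts can adapt to each other's adversarially trained state.
# opt_router = torch.optim.SGD(model.router_parameters(), lr=0.01)  # assumed helper
# opt_expert = torch.optim.SGD(model.expert_parameters(), lr=0.1)   # assumed helper
# for epoch in range(num_epochs):
#     advmoe_epoch(model, train_loader, opt_router, opt_expert,
#                  update_router=(epoch % 2 == 0))

Only one optimizer steps per phase, so the frozen group's parameters are untouched even though gradients are computed for the whole model; model.zero_grad() clears any stale gradients before each update.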

Cite

Text

Zhang et al. "Robust Mixture-of-Expert Training for Convolutional Neural Networks." International Conference on Computer Vision, 2023, pp. 90-101. doi:10.1109/ICCV51070.2023.00015

Markdown

[Zhang et al. "Robust Mixture-of-Expert Training for Convolutional Neural Networks." International Conference on Computer Vision, 2023, pp. 90-101.](https://mlanthology.org/iccv/2023/zhang2023iccv-robust-a/) doi:10.1109/ICCV51070.2023.00015

BibTeX

@inproceedings{zhang2023iccv-robust-a,
  title     = {{Robust Mixture-of-Expert Training for Convolutional Neural Networks}},
  author    = {Zhang, Yihua and Cai, Ruisi and Chen, Tianlong and Zhang, Guanhua and Zhang, Huan and Chen, Pin-Yu and Chang, Shiyu and Wang, Zhangyang and Liu, Sijia},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {90--101},
  doi       = {10.1109/ICCV51070.2023.00015},
  url       = {https://mlanthology.org/iccv/2023/zhang2023iccv-robust-a/}
}