Learning with Pseudo-Ensembles

Abstract

We formalize the notion of a pseudo-ensemble, a (possibly infinite) collection of child models spawned from a parent model by perturbing it according to some noise process. For example, dropout (Hinton et al., 2012) in a deep neural network trains a pseudo-ensemble of child subnetworks generated by randomly masking nodes in the parent network. We examine the relationship of pseudo-ensembles, which involve perturbation in model-space, to standard ensemble methods and existing notions of robustness, which focus on perturbation in observation-space. We present a novel regularizer based on making the behavior of a pseudo-ensemble robust with respect to the noise process generating it. In the fully-supervised setting, our regularizer matches the performance of dropout. But, unlike dropout, our regularizer naturally extends to the semi-supervised setting, where it produces state-of-the-art results. We provide a case study in which we transform the Recursive Neural Tensor Network of Socher et al. (2013) into a pseudo-ensemble, significantly improving its performance on a real-world sentiment analysis benchmark.
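To make the training objective concrete, here is a minimal sketch, not the authors' code: it assumes PyTorch (which postdates the paper), and the function name `pseudo_ensemble_loss`, the network shape, and the output-level KL consistency penalty are all illustrative choices. The paper's actual Pseudo-Ensemble Agreement regularizer matches activations at hidden layers as well; this sketch keeps only the core idea of sampling child models via independent dropout masks and penalizing their disagreement on unlabeled data alongside an ordinary supervised loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Parent model: dropout makes every forward pass a randomly sampled
# child model, i.e. a member of the pseudo-ensemble.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(256, 10),
)

def pseudo_ensemble_loss(x_labeled, y, x_unlabeled, lam=1.0):
    """Supervised loss on labeled data plus a consistency penalty that
    encourages two independently perturbed children to agree on
    unlabeled data (simplified, output-level variant)."""
    # Supervised term: one sampled child model on labeled data.
    sup = F.cross_entropy(model(x_labeled), y)
    # Consistency term: two forward passes draw two independent dropout
    # masks; penalize the disagreement between their output distributions.
    log_p1 = F.log_softmax(model(x_unlabeled), dim=1)
    p2 = F.softmax(model(x_unlabeled), dim=1).detach()  # fixed target
    cons = F.kl_div(log_p1, p2, reduction="batchmean")
    return sup + lam * cons

# Illustrative usage with random stand-in data.
x_l, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
x_u = torch.randn(128, 784)
loss = pseudo_ensemble_loss(x_l, y, x_u)
loss.backward()
```

Detaching one branch turns the penalty into a fixed-target matching loss; a symmetric variant that backpropagates through both children is an equally plausible reading of the idea.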

Cite

Text

Bachman et al. "Learning with Pseudo-Ensembles." Neural Information Processing Systems, 2014.

Markdown

[Bachman et al. "Learning with Pseudo-Ensembles." Neural Information Processing Systems, 2014.](https://mlanthology.org/neurips/2014/bachman2014neurips-learning/)

BibTeX

@inproceedings{bachman2014neurips-learning,
  title     = {{Learning with Pseudo-Ensembles}},
  author    = {Bachman, Philip and Alsharif, Ouais and Precup, Doina},
  booktitle = {Neural Information Processing Systems},
  year      = {2014},
  pages     = {3365--3373},
  url       = {https://mlanthology.org/neurips/2014/bachman2014neurips-learning/}
}