Deconstructing the Ladder Network Architecture
Abstract
The Ladder Network is a recent approach to semi-supervised learning that has proven very successful. While it shows impressive performance, the Ladder Network intertwines many components whose individual contributions are not obvious in such a complex architecture. This paper presents an extensive experimental investigation of variants of the Ladder Network in which we replaced or removed individual components to learn about their relative importance. For semi-supervised tasks, we conclude that the most important contribution is made by the lateral connections, followed by the application of noise, and the choice of what we refer to as the ‘combinator function’. As the number of labeled training examples increases, the lateral connections and the reconstruction criterion become less important, with most of the generalization improvement coming from the injection of noise in each layer. Finally, we introduce a combinator function that reduces test error rates on Permutation-Invariant MNIST to 0.57% for the supervised setting, and to 0.97% and 1.0% for semi-supervised settings with 1000 and 100 labeled examples, respectively.
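To make the role of the ‘combinator function’ concrete, the sketch below shows one plausible form of it: a small MLP that merges the lateral (noisy encoder) activation z̃ with the top-down decoder signal u into a denoised estimate ẑ, in the spirit of the augmented-MLP combinator discussed in the paper. This is a minimal illustration only; the layer sizes, parameter sharing across units, and initialization are assumptions for the example and not the authors' reference implementation.

```python
import numpy as np

def amlp_combinator(z_tilde, u, W1, b1, W2, b2):
    """Illustrative MLP combinator: maps [z_tilde, u, z_tilde * u] to z_hat per unit."""
    x = np.stack([z_tilde, u, z_tilde * u], axis=-1)   # (..., units, 3) augmented input
    h = np.maximum(0.0, x @ W1 + b1)                   # small hidden ReLU layer
    z_hat = (h @ W2 + b2).squeeze(-1)                  # one scalar output per unit
    return z_hat

# Toy usage with hypothetical shapes: batch of 4, layer of 10 units,
# hidden width 4 for the tiny combinator MLP (all sizes are assumptions).
rng = np.random.default_rng(0)
z_tilde = rng.normal(size=(4, 10))                     # noisy lateral activation
u = rng.normal(size=(4, 10))                           # top-down decoder signal
W1, b1 = 0.1 * rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = 0.1 * rng.normal(size=(4, 1)), np.zeros(1)
print(amlp_combinator(z_tilde, u, W1, b1, W2, b2).shape)  # (4, 10)
```

In the actual Ladder Network, the output ẑ of the combinator at each layer is compared against the clean encoder activation to form the layer-wise reconstruction cost that drives the unsupervised part of training.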
Cite
Pezeshki et al. "Deconstructing the Ladder Network Architecture." International Conference on Machine Learning, 2016. https://mlanthology.org/icml/2016/pezeshki2016icml-deconstructing/
BibTeX
@inproceedings{pezeshki2016icml-deconstructing,
title = {{Deconstructing the Ladder Network Architecture}},
author = {Pezeshki, Mohammad and Fan, Linxi and Brakel, Philemon and Courville, Aaron and Bengio, Yoshua},
booktitle = {International Conference on Machine Learning},
year = {2016},
pages = {2368-2376},
volume = {48},
url = {https://mlanthology.org/icml/2016/pezeshki2016icml-deconstructing/}
}