EigenNet: Towards Fast and Structural Learning of Deep Neural Networks
Abstract
Deep neural networks (DNNs) are difficult to train and prone to overfitting. We address these two issues by introducing EigenNet, an architecture that not only accelerates training but also adjusts the number of hidden neurons to reduce overfitting. Both are achieved by whitening the information flows of DNNs and removing those eigenvectors that may capture noise. The former improves the conditioning of the Fisher information matrix, whilst the latter increases generalization capability. These appealing properties of EigenNet can benefit many recent DNN structures, such as network in network and inception, by wrapping their hidden layers into the layers of EigenNet. The modeling capacities of the original networks are preserved. Compared to stochastic gradient descent, EigenNet reduces both the training wall-clock time and the number of updates on various datasets, including MNIST, CIFAR-10, and CIFAR-100.
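The two operations named in the abstract, whitening and discarding eigenvectors that may capture noise, can be illustrated with ordinary PCA whitening on a batch of activations. The sketch below is a generic NumPy illustration and not the EigenNet layer itself. The function name `whiten_and_truncate`, the variance threshold `keep_ratio`, the stabilizer `eps`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def whiten_and_truncate(x, keep_ratio=0.99, eps=1e-5):
    """PCA-whiten activations x of shape (n_samples, n_features).

    Keeps the leading eigenvectors that explain `keep_ratio` of the variance
    (an illustrative criterion, not taken from the paper) and drops the rest.
    """
    mean = x.mean(axis=0)
    xc = x - mean
    cov = xc.T @ xc / x.shape[0]

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Smallest k whose cumulative explained variance reaches keep_ratio.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum, keep_ratio)) + 1, len(eigvals))

    # Project onto the kept eigenvectors and rescale to unit variance.
    proj = eigvecs[:, :k] / np.sqrt(eigvals[:k] + eps)
    return xc @ proj, proj, mean


# Synthetic activations: 3 informative features plus 7 near-constant noisy ones.
rng = np.random.default_rng(0)
x = np.zeros((2000, 10))
x[:, :3] = rng.standard_normal((2000, 3))
x += 1e-3 * rng.standard_normal((2000, 10))

z, proj, mean = whiten_and_truncate(x)
cov_z = z.T @ z / z.shape[0]
```

Here the number of retained columns in `z` plays the role of an adjusted layer width, and `cov_z` is close to the identity, which is the sense in which the retained signal is whitened.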
Cite
Text
Luo. "EigenNet: Towards Fast and Structural Learning of Deep Neural Networks." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/338
Markdown
[Luo. "EigenNet: Towards Fast and Structural Learning of Deep Neural Networks." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/luo2017ijcai-eigennet/) doi:10.24963/IJCAI.2017/338
BibTeX
@inproceedings{luo2017ijcai-eigennet,
title = {{EigenNet: Towards Fast and Structural Learning of Deep Neural Networks}},
author = {Luo, Ping},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2017},
pages = {2428-2434},
doi = {10.24963/IJCAI.2017/338},
url = {https://mlanthology.org/ijcai/2017/luo2017ijcai-eigennet/}
}