EigenNet: Towards Fast and Structural Learning of Deep Neural Networks

Abstract

Deep Neural Networks (DNNs) are difficult to train and prone to overfitting. We address these two issues by introducing EigenNet, an architecture that not only accelerates training but also adjusts the number of hidden neurons to reduce overfitting. Both are achieved by whitening the information flows of DNNs and removing the eigenvectors that may capture noise. The former improves the conditioning of the Fisher information matrix, whilst the latter increases generalization capability. These appealing properties of EigenNet can benefit many recent DNN structures, such as Network in Network and Inception, by wrapping their hidden layers into EigenNet layers while preserving the modeling capacities of the original networks. Compared with stochastic gradient descent, EigenNet reduces both the training wall-clock time and the number of updates on various datasets, including MNIST, CIFAR-10, and CIFAR-100.
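The two operations the abstract mentions, whitening activations and discarding low-variance eigen-directions, can be illustrated with a minimal NumPy sketch. This is not the paper's EigenNet layer; it only demonstrates, on a synthetic batch of hidden activations `H` (a made-up example), how eigendecomposition of the activation covariance yields a whitening transform and a criterion for pruning directions that mostly carry noise:

```python
import numpy as np

# Synthetic batch of correlated hidden activations (256 samples x 8 units).
rng = np.random.default_rng(0)
H = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 8))

# Center and eigendecompose the covariance of the activations.
Hc = H - H.mean(axis=0)
cov = Hc.T @ Hc / Hc.shape[0]
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# Keep only eigen-directions above a variance threshold (illustrative cutoff);
# low-variance directions are the ones most likely to capture noise.
keep = eigvals > 1e-2 * eigvals.max()
V, s = eigvecs[:, keep], eigvals[keep]

# Whiten: project onto the kept eigenvectors and rescale to unit variance.
W = Hc @ V / np.sqrt(s)
cov_w = W.T @ W / W.shape[0]
print(np.allclose(cov_w, np.eye(W.shape[1])))  # whitened covariance is the identity
```

The whitened covariance being the identity is what "improves conditioning" refers to: after whitening, the curvature of the loss with respect to these activations is far more uniform across directions, which tends to speed up gradient descent. The `1e-2` cutoff above is an arbitrary choice for illustration, not a value from the paper.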

Cite

Text

Luo. "EigenNet: Towards Fast and Structural Learning of Deep Neural Networks." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/338

Markdown

[Luo. "EigenNet: Towards Fast and Structural Learning of Deep Neural Networks." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/luo2017ijcai-eigennet/) doi:10.24963/IJCAI.2017/338

BibTeX

@inproceedings{luo2017ijcai-eigennet,
  title     = {{EigenNet: Towards Fast and Structural Learning of Deep Neural Networks}},
  author    = {Luo, Ping},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {2428--2434},
  doi       = {10.24963/IJCAI.2017/338},
  url       = {https://mlanthology.org/ijcai/2017/luo2017ijcai-eigennet/}
}