Deep Neural Networks for High Dimension, Low Sample Size Data

Abstract

Deep neural networks (DNNs) have achieved breakthroughs in applications with large sample sizes. However, when facing high dimension, low sample size (HDLSS) data, such as the phenotype prediction problem using genetic data in bioinformatics, DNNs suffer from overfitting and high-variance gradients. In this paper, we propose a DNN model tailored for HDLSS data, named Deep Neural Pursuit (DNP). DNP selects a subset of high dimensional features to alleviate overfitting and averages over multiple dropouts to calculate gradients with low variance. As the first DNN method applied to HDLSS data, DNP enjoys the advantages of high nonlinearity, robustness to high dimensionality, the capability of learning from a small number of samples, stability in feature selection, and end-to-end training. We demonstrate these advantages of DNP through empirical results on both synthetic and real-world biological datasets.
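The multiple-dropout idea in the abstract can be illustrated with a minimal sketch: averaging gradients computed under several independent dropout masks reduces the gradient's variance compared to a single mask. The toy linear model, the dimensions, and the `dropout_gradient` / `averaged_gradient` helpers below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy HDLSS setting: many features (d), few samples (n). Values are illustrative.
n, d = 20, 500
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0  # only a few features are truly informative
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)   # current weights of a simplified linear "network"
p_keep = 0.5      # dropout keep probability

def dropout_gradient(w, rng):
    """Squared-loss gradient under one random (inverted) dropout mask on the inputs."""
    mask = (rng.random(d) < p_keep) / p_keep
    Xm = X * mask
    resid = Xm @ w - y
    return Xm.T @ resid / n

def averaged_gradient(w, n_masks, rng):
    """Average gradients over several dropout masks, as in DNP's multiple-dropout trick."""
    return np.mean([dropout_gradient(w, rng) for _ in range(n_masks)], axis=0)

# Empirically compare the per-coordinate variance of single-mask vs averaged gradients.
single = np.stack([dropout_gradient(w, rng) for _ in range(200)])
avg10 = np.stack([averaged_gradient(w, 10, rng) for _ in range(200)])
print("single-mask variance:", single.var(axis=0).mean())
print("10-mask average variance:", avg10.var(axis=0).mean())
```

Averaging over 10 masks should shrink the gradient variance roughly tenfold, which is the stabilizing effect the abstract attributes to taking the average over multiple dropouts.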

Cite

Text

Liu et al. "Deep Neural Networks for High Dimension, Low Sample Size Data." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/318

Markdown

[Liu et al. "Deep Neural Networks for High Dimension, Low Sample Size Data." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/liu2017ijcai-deep-a/) doi:10.24963/IJCAI.2017/318

BibTeX

@inproceedings{liu2017ijcai-deep-a,
  title     = {{Deep Neural Networks for High Dimension, Low Sample Size Data}},
  author    = {Liu, Bo and Wei, Ying and Zhang, Yu and Yang, Qiang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {2287--2293},
  doi       = {10.24963/IJCAI.2017/318},
  url       = {https://mlanthology.org/ijcai/2017/liu2017ijcai-deep-a/}
}