An Analysis on Negative Curvature Induced by Singularity in Multi-Layer Neural-Network Learning

Abstract

In the neural-network parameter space, an attractive field is likely to be induced by singularities. In such a singularity region, first-order gradient learning typically causes a long plateau with very little change in the objective function value E (hence, a flat region), so the plateau may be confused with an "attractive" local minimum. Our analysis shows that the Hessian matrix of E tends to be indefinite in the vicinity of (perturbed) singular points, suggesting a promising strategy that exploits negative curvature to escape from the singularity plateaus. For numerical evidence, we limit the scope to small examples (some of which are found in journal papers) that allow us to confirm singularities and the eigenvalues of the Hessian matrix, and for which computation using a descent direction of negative curvature encounters no plateau. Even for those small problems, no efficient plateau-avoiding methods had previously been developed.
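The abstract's central claim, that the Hessian of E is indefinite near (perturbed) singular points and that its negative-curvature direction offers an escape from the plateau, can be checked numerically on a toy problem. The following is a minimal sketch, not the authors' code: the 1-2-1 tanh network, the sinusoidal training data, the perturbation size, the finite-difference scheme, and the step length 0.1 are all illustrative assumptions.

import numpy as np

X = np.linspace(-1.0, 1.0, 20)         # toy inputs (illustrative)
Y = np.sin(np.pi * X)                  # toy targets (illustrative)

def loss(theta):
    # Squared-error objective E of a 1-2-1 tanh network.
    # theta = [w1, w2, v1, v2]: hidden weights w, output weights v.
    w1, w2, v1, v2 = theta
    out = v1 * np.tanh(w1 * X) + v2 * np.tanh(w2 * X)
    return 0.5 * np.sum((out - Y) ** 2)

def num_hessian(f, theta, eps=1e-4):
    # Central-difference Hessian of f at theta.
    n = len(theta)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            def g(di, dj):
                t = np.asarray(theta, dtype=float).copy()
                t[i] += di; t[j] += dj
                return f(t)
            H[i, j] = (g(eps, eps) - g(eps, -eps)
                       - g(-eps, eps) + g(-eps, -eps)) / (4 * eps ** 2)
    return 0.5 * (H + H.T)             # symmetrize away numerical noise

# Perturbed "overlap" singular point: w1 ~ w2 makes the two hidden
# units redundant, a standard singularity of multi-layer networks.
theta = np.array([0.7, 0.7 + 1e-3, 0.3, 0.3])
H = num_hessian(loss, theta)
eigvals, eigvecs = np.linalg.eigh(H)
print("Hessian eigenvalues:", eigvals)
# Near the singularity at least one eigenvalue is typically negative
# (the indefiniteness claimed in the abstract); the exact values
# depend on the toy data chosen above.

# Escape step: move along the eigenvector of the most negative
# eigenvalue, oriented downhill with respect to the gradient.
grad = np.array([(loss(theta + 1e-6 * e) - loss(theta - 1e-6 * e)) / 2e-6
                 for e in np.eye(4)])
d = eigvecs[:, 0]                      # eigh sorts eigenvalues ascending
if grad @ d > 0:
    d = -d
print("E before:", loss(theta), "E after:", loss(theta + 0.1 * d))

Because the gradient is nearly zero on the plateau, the second-order term dominates, and a step along a direction of negative curvature decreases E where first-order gradient steps stall; this is the escape strategy the paper analyzes, shown here only under the stated toy assumptions.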

Cite

Text

Mizutani and Dreyfus. "An Analysis on Negative Curvature Induced by Singularity in Multi-Layer Neural-Network Learning." Neural Information Processing Systems, 2010.

Markdown

[Mizutani and Dreyfus. "An Analysis on Negative Curvature Induced by Singularity in Multi-Layer Neural-Network Learning." Neural Information Processing Systems, 2010.](https://mlanthology.org/neurips/2010/mizutani2010neurips-analysis/)

BibTeX

@inproceedings{mizutani2010neurips-analysis,
  title     = {{An Analysis on Negative Curvature Induced by Singularity in Multi-Layer Neural-Network Learning}},
  author    = {Mizutani, Eiji and Dreyfus, Stuart},
  booktitle = {Neural Information Processing Systems},
  year      = {2010},
  pages     = {1669--1677},
  url       = {https://mlanthology.org/neurips/2010/mizutani2010neurips-analysis/}
}