ISAAC Newton: Input-Based Approximate Curvature for Newton's Method

Abstract

We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based only on the input to the respective layer, without a substantial computational overhead. The proposed method allows effective training even in small-batch stochastic regimes, which makes it competitive with both first-order and second-order methods.
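The exact update rule is given in the paper; as a rough illustration of why the overhead can vanish when the batch size is smaller than the number of neurons, the sketch below preconditions a layer's weight gradient with a damped Gram matrix of the layer inputs. The particular preconditioner form, the damping value `lam`, and all dimensions here are illustrative assumptions, not the authors' exact formulation; the point is that the Woodbury identity reduces an n×n inverse to a b×b one.

```python
import numpy as np

rng = np.random.default_rng(0)
b, n = 4, 16          # batch size b smaller than the number of neurons n
lam = 0.1             # Tikhonov damping (hypothetical value)

X = rng.standard_normal((b, n))   # inputs to one layer for a mini-batch
G = rng.standard_normal((n, 8))   # gradient w.r.t. this layer's weights

# Direct input-based preconditioning: solve an n x n system, O(n^3).
A = X.T @ X / b + lam * np.eye(n)
G_direct = np.linalg.solve(A, G)

# Woodbury identity: only a b x b system, O(b^3) -- cheap when b << n.
S = lam * b * np.eye(b) + X @ X.T
G_woodbury = (G - X.T @ np.linalg.solve(S, X @ G)) / lam

assert np.allclose(G_direct, G_woodbury)
```

Since the b×b solve scales with the batch size rather than the layer width, the relative cost of the conditioning step shrinks as layers grow, matching the abstract's claim of asymptotically vanishing overhead in the small-batch regime.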

Cite

Text

Petersen et al. "ISAAC Newton: Input-Based Approximate Curvature for Newton's Method." International Conference on Learning Representations, 2023.

Markdown

[Petersen et al. "ISAAC Newton: Input-Based Approximate Curvature for Newton's Method." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/petersen2023iclr-isaac/)

BibTeX

@inproceedings{petersen2023iclr-isaac,
  title     = {{ISAAC Newton: Input-Based Approximate Curvature for Newton's Method}},
  author    = {Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/petersen2023iclr-isaac/}
}