Oracle Complexity of Second-Order Methods for Finite-Sum Problems

Abstract

Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require much fewer iterations than first-order methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer – perhaps surprisingly – is negative, at least in terms of worst-case guarantees. However, we also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.

Cite

Text

Arjevani and Shamir. "Oracle Complexity of Second-Order Methods for Finite-Sum Problems." International Conference on Machine Learning, 2017.

Markdown

[Arjevani and Shamir. "Oracle Complexity of Second-Order Methods for Finite-Sum Problems." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/arjevani2017icml-oracle/)

BibTeX

@inproceedings{arjevani2017icml-oracle,
  title     = {{Oracle Complexity of Second-Order Methods for Finite-Sum Problems}},
  author    = {Arjevani, Yossi and Shamir, Ohad},
  booktitle = {International Conference on Machine Learning},
  year      = {2017},
  pages     = {205-213},
  volume    = {70},
  url       = {https://mlanthology.org/icml/2017/arjevani2017icml-oracle/}
}