DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Abstract

For optimization of a large sum of functions in a distributed computing environment, we present a novel communication-efficient Newton-type algorithm that enjoys a variety of advantages over similar existing methods. Our algorithm, DINGO, is derived by optimizing the gradient's norm as a surrogate function. DINGO does not impose any specific form on the underlying functions, and its range of application extends far beyond convexity and smoothness. The underlying sub-problems of DINGO are simple linear least-squares, for which a plethora of efficient algorithms exist. DINGO involves a few hyper-parameters that are easy to tune, and we theoretically show that a strict reduction in the surrogate objective is guaranteed, regardless of the selected hyper-parameters.
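
To make the abstract's central idea concrete, the sketch below shows a single-machine Newton-type step on the gradient-norm surrogate φ(w) = ½‖∇f(w)‖², with the update direction obtained from a linear least-squares sub-problem and a backtracking line search enforcing a strict reduction in the surrogate. This is an illustration under stated assumptions, not the paper's distributed DINGO method: the toy objective, the pseudoinverse solve via `np.linalg.lstsq`, and the Armijo constants are all choices made here for exposition.

```python
# Minimal single-machine sketch of the gradient-norm surrogate idea,
# NOT the authors' distributed DINGO algorithm. The surrogate is
# phi(w) = 0.5 * ||grad f(w)||^2; the direction solves the linear
# least-squares sub-problem min_p ||H(w) p + g(w)||^2, and backtracking
# enforces a strict reduction in the surrogate objective.
import numpy as np

def grad(w):
    # Gradient of an illustrative nonconvex objective (an assumption of
    # this sketch): f(w) = 0.25 * sum(w**4) - 0.5 * sum(w**2).
    return w**3 - w

def hess(w):
    # Hessian of the same objective; diagonal here for simplicity, but
    # lstsq (pseudoinverse) handles any, even singular, H.
    return np.diag(3.0 * w**2 - 1.0)

def gradient_norm_step(w, rho=1e-4, max_backtracks=50):
    g, H = grad(w), hess(w)
    # Least-squares sub-problem: p = argmin_p ||H p + g||^2. In the
    # distributed setting, each worker would solve such a sub-problem
    # on its local data.
    p = np.linalg.lstsq(H, -g, rcond=None)[0]
    phi = 0.5 * (g @ g)
    # Armijo-style backtracking on phi; since H is symmetric,
    # grad phi = H g, so the directional derivative along p is
    # g @ (H @ p), which is negative for the least-squares direction.
    alpha = 1.0
    for _ in range(max_backtracks):
        g_new = grad(w + alpha * p)
        if 0.5 * (g_new @ g_new) <= phi + rho * alpha * (g @ (H @ p)):
            break
        alpha *= 0.5
    return w + alpha * p

w = np.array([1.5, -0.3, 0.8])
for _ in range(20):
    w = gradient_norm_step(w)
print(np.linalg.norm(grad(w)))  # shrinks toward a stationary point
```

Note how the sketch never assumes convexity: the Hessian above is indefinite for small coordinates of `w`, yet the line search on the surrogate still guarantees monotone decrease of the gradient norm, mirroring the guarantee stated in the abstract.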

Cite

Text

Crane and Roosta. "DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization." Neural Information Processing Systems, 2019.

Markdown

[Crane and Roosta. "DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/crane2019neurips-dingo/)

BibTeX

@inproceedings{crane2019neurips-dingo,
  title     = {{DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization}},
  author    = {Crane, Rixon and Roosta, Fred},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {9498--9508},
  url       = {https://mlanthology.org/neurips/2019/crane2019neurips-dingo/}
}