DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization
Abstract
For optimization of a large sum of functions in a distributed computing environment, we present a novel communication-efficient Newton-type algorithm that enjoys a variety of advantages over similar existing methods. Our algorithm, DINGO, is derived by optimization of the gradient's norm as a surrogate function. DINGO does not impose any specific form on the underlying functions, and its application range extends far beyond convexity and smoothness. The underlying sub-problems of DINGO are simple linear least-squares problems, for which a plethora of efficient algorithms exist. DINGO involves a few hyper-parameters that are easy to tune, and we theoretically show that a strict reduction in the surrogate objective is guaranteed, regardless of the selected hyper-parameters.
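To make the idea of using the gradient's norm as a surrogate concrete, the following is a minimal single-machine sketch in Python. It is not the distributed DINGO algorithm from the paper: the toy objective, the names `grad`, `hess`, and `gradient_norm_newton`, and the parameters `rho`, `iters`, and `tol` are all assumptions made for illustration. Each iteration solves a linear least-squares sub-problem for the step and then backtracks until the surrogate `||grad f(x)||^2` strictly decreases.

```python
import numpy as np

# Hypothetical toy objective (not from the paper):
#   f(x) = 0.5 * x^T A x - b^T x + 0.25 * sum(x_i^4)
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])


def grad(x):
    """Gradient of the toy objective."""
    return A @ x - b + x**3


def hess(x):
    """Hessian of the toy objective."""
    return A + np.diag(3.0 * x**2)


def gradient_norm_newton(x, iters=50, rho=1e-4, tol=1e-10):
    """Reduce the surrogate ||grad f(x)||^2 with least-squares Newton-type steps."""
    for _ in range(iters):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        H = hess(x)
        # Linear least-squares sub-problem: minimize ||H p + g|| over p.
        p, *_ = np.linalg.lstsq(H, -g, rcond=None)
        # Directional derivative of ||grad f||^2 along p is 2 * g^T H p
        # (H is symmetric), which is non-positive for this choice of p.
        slope = 2.0 * g @ (H @ p)
        alpha = 1.0
        # Backtracking (Armijo-type) line search on the surrogate.
        while (np.linalg.norm(grad(x + alpha * p)) ** 2
               > g @ g + rho * alpha * slope) and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * p
    return x


x_star = gradient_norm_newton(np.zeros(2))
print("gradient norm at result:", np.linalg.norm(grad(x_star)))
```

In this sketch the line search is performed on the gradient norm rather than on the objective value, which is what lets the same loop be written without assuming convexity; on a distributed problem, the paper's method additionally has to combine information across workers, which is omitted here.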
Cite
Text
Crane and Roosta. "DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization." Neural Information Processing Systems, 2019.
Markdown
[Crane and Roosta. "DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/crane2019neurips-dingo/)
BibTeX
@inproceedings{crane2019neurips-dingo,
title = {{DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization}},
author = {Crane, Rixon and Roosta, Fred},
booktitle = {Neural Information Processing Systems},
year = {2019},
pages = {9498-9508},
url = {https://mlanthology.org/neurips/2019/crane2019neurips-dingo/}
}