An Adaptive Polyak Heavy-Ball Method

Abstract

The heavy-ball (HB) method is a well-established choice for large-scale machine learning problems, and with Polyak’s optimal hyper-parameters it achieves the fastest local convergence rate on smooth, strongly convex objective functions. However, that rate relies on hyper-parameters that are uncertain and time-invariant, which limits the method’s potential. In this paper, we propose an adaptive HB method that estimates Polyak’s optimal hyper-parameters at each iteration. Our adaptive approach employs the absolute differences of the current and previous model parameters and of their gradients; this representation yields a computationally efficient optimizer. We show that our method guarantees a global linear convergence rate for smooth and strongly convex objective functions. In the stochastic setting, we show that the proposed stochastic algorithm converges almost surely for non-convex smooth functions with bounded gradients. We validate the effectiveness of our method on image classification datasets with no empirical tuning, and demonstrate its superiority on quadratic and non-convex functions by comparing its performance to state-of-the-art optimizers.
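To make the idea concrete, here is a minimal, hypothetical sketch of an adaptive heavy-ball loop in NumPy. It uses Polyak's classical optimal hyper-parameters, alpha = 4/(sqrt(L)+sqrt(mu))^2 and beta = ((sqrt(L)-sqrt(mu))/(sqrt(L)+sqrt(mu)))^2, with L and mu replaced by per-iteration estimates built from absolute differences of successive iterates and gradients, in the spirit of the abstract. The specific secant-ratio estimators below are illustrative stand-ins, not the authors' exact update rule.

```python
import numpy as np

def adaptive_heavy_ball(grad, x0, iters=500, eps=1e-12):
    """Illustrative adaptive heavy-ball sketch (not the paper's exact method).

    Estimates smoothness L and strong-convexity mu each iteration from
    coordinate-wise ratios |g_k - g_{k-1}| / |x_k - x_{k-1}|, then plugs
    them into Polyak's optimal step size and momentum.
    """
    x_prev = x0.copy()
    g_prev = grad(x_prev)
    # Bootstrap with one small plain gradient step so differences exist.
    x = x_prev - 1e-3 * g_prev
    for _ in range(iters):
        g = grad(x)
        dx = np.abs(x - x_prev)
        dg = np.abs(g - g_prev)
        mask = dx > eps
        if not mask.any():
            break  # iterates have stopped moving
        # Secant ratios act as local curvature estimates (hypothetical rule).
        ratios = dg[mask] / dx[mask]
        L_hat = max(ratios.max(), eps)
        mu_hat = max(ratios.min(), eps)
        sL, sm = np.sqrt(L_hat), np.sqrt(mu_hat)
        alpha = 4.0 / (sL + sm) ** 2                 # Polyak step size
        beta = ((sL - sm) / (sL + sm)) ** 2          # Polyak momentum
        x_new = x - alpha * g + beta * (x - x_prev)  # heavy-ball update
        x_prev, g_prev, x = x, g, x_new
    return x

# Usage: minimize the strongly convex quadratic f(x) = 0.5 * x^T A x,
# whose unique minimizer is the origin.
A = np.diag([1.0, 10.0, 100.0])
x_star = adaptive_heavy_ball(lambda x: A @ x, np.array([1.0, 1.0, 1.0]))
```

For a diagonal quadratic the secant ratios recover the extreme eigenvalues exactly, so the sketch runs with Polyak's optimal parameters after the first step; on general objectives the estimates are only local and would need the safeguards developed in the paper.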

Cite

Text

Saab Jr. et al. "An Adaptive Polyak Heavy-Ball Method." Machine Learning, 2022. doi:10.1007/s10994-022-06215-7

Markdown

[Saab Jr. et al. "An Adaptive Polyak Heavy-Ball Method." Machine Learning, 2022.](https://mlanthology.org/mlj/2022/jr2022mlj-adaptive/) doi:10.1007/s10994-022-06215-7

BibTeX

@article{jr2022mlj-adaptive,
  title     = {{An Adaptive Polyak Heavy-Ball Method}},
  author    = {Saab Jr., Samer and Phoha, Shashi and Zhu, Minghui and Ray, Asok},
  journal   = {Machine Learning},
  year      = {2022},
  pages     = {3245--3277},
  doi       = {10.1007/s10994-022-06215-7},
  volume    = {111},
  url       = {https://mlanthology.org/mlj/2022/jr2022mlj-adaptive/}
}