Fast Convergence of Random Reshuffling Under Over-Parameterization and the Polyak-Łojasiewicz Condition

Abstract

Modern machine learning models are often over-parameterized and as a result they can interpolate the training data. Under such a scenario, we study the convergence properties of a sampling-without-replacement variant of stochastic gradient descent (SGD) known as random reshuffling (RR). Unlike SGD that samples data with replacement at every iteration, RR chooses a random permutation of data at the beginning of each epoch and each iteration chooses the next sample from the permutation. For under-parameterized models, it has been shown RR can converge faster than SGD under certain assumptions. However, previous works do not show that RR outperforms SGD in over-parameterized settings except in some highly-restrictive scenarios. For the class of Polyak-Łojasiewicz (PL) functions, we show that RR can outperform SGD in over-parameterized settings when either one of the following holds: (i) the number of samples ( n ) is less than the product of the condition number ( $\kappa $ κ ) and the parameter ( $\alpha $ α ) of a weak growth condition (WGC), or (ii) n is less than the parameter ( $\rho $ ρ ) of a strong growth condition (SGC).

Cite

Text

Fan et al. "Fast Convergence of Random Reshuffling Under Over-Parameterization and the Polyak-Łojasiewicz Condition." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43421-1_18

Markdown

[Fan et al. "Fast Convergence of Random Reshuffling Under Over-Parameterization and the Polyak-Łojasiewicz Condition." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/fan2023ecmlpkdd-fast/) doi:10.1007/978-3-031-43421-1_18

BibTeX

@inproceedings{fan2023ecmlpkdd-fast,
  title     = {{Fast Convergence of Random Reshuffling Under Over-Parameterization and the Polyak-Łojasiewicz Condition}},
  author    = {Fan, Chen and Thrampoulidis, Christos and Schmidt, Mark},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {301-315},
  doi       = {10.1007/978-3-031-43421-1_18},
  url       = {https://mlanthology.org/ecmlpkdd/2023/fan2023ecmlpkdd-fast/}
}