Dimension-Free Bounds for Low-Precision Training
Abstract
Low-precision training is a promising way of decreasing the time and energy cost of training machine learning models. Previous work has analyzed low-precision training algorithms, such as low-precision stochastic gradient descent, and derived theoretical bounds on their convergence rates. These bounds tend to depend on the dimension of the model $d$ in that the number of bits needed to achieve a particular error bound increases as $d$ increases. In this paper, we derive new bounds for low-precision training algorithms that do not contain the dimension $d$, which lets us better understand what affects the convergence of these algorithms as parameters scale. Our methods also generalize naturally to let us prove new convergence bounds on low-precision training with other quantization schemes, such as low-precision floating-point computation and logarithmic quantization.
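For intuition about the class of algorithms analyzed here, the sketch below shows SGD with its iterates kept on a low-precision fixed-point grid via unbiased stochastic rounding. This is a minimal illustration under assumed conventions, not the paper's exact algorithm or analysis setting; `grad_fn`, `delta`, `lr`, and `steps` are placeholder names, and the grid spacing `delta` stands in for the quantization step determined by the number of bits.

```python
import numpy as np

def stochastic_round(x, delta):
    """Round each entry of x to a multiple of delta, choosing the upper or lower
    grid point at random so the result is unbiased: E[round(x)] = x."""
    scaled = x / delta
    low = np.floor(scaled)
    prob_up = scaled - low                        # chance of rounding up
    rounded = low + (np.random.rand(*x.shape) < prob_up)
    return rounded * delta

def low_precision_sgd(grad_fn, w0, delta, lr=0.05, steps=1000):
    """Plain SGD, except the iterate is re-quantized to the fixed-point grid
    of spacing delta after every update (a sketch of low-precision training)."""
    w = stochastic_round(np.asarray(w0, dtype=float), delta)
    for _ in range(steps):
        g = grad_fn(w)                            # stochastic gradient at w
        w = stochastic_round(w - lr * g, delta)
    return w

if __name__ == "__main__":
    # Toy usage: minimize ||w||^2 with noisy gradients on an 8-bit-style grid.
    rng = np.random.default_rng(0)
    grad = lambda w: 2 * w + 0.01 * rng.standard_normal(w.shape)
    w = low_precision_sgd(grad, np.ones(10), delta=2**-8)
    print(np.linalg.norm(w))
```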
Cite
Text
Li and De Sa. "Dimension-Free Bounds for Low-Precision Training." Neural Information Processing Systems, 2019.
Markdown
[Li and De Sa. "Dimension-Free Bounds for Low-Precision Training." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/li2019neurips-dimensionfree/)
BibTeX
@inproceedings{li2019neurips-dimensionfree,
  title     = {{Dimension-Free Bounds for Low-Precision Training}},
  author    = {Li, Zheng and De Sa, Christopher M},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {11733--11761},
  url       = {https://mlanthology.org/neurips/2019/li2019neurips-dimensionfree/}
}