Weight-Balancing Fixes and Flows for Deep Learning

Abstract

Feedforward neural networks with homogeneous activation functions possess an internal symmetry: the functions they compute do not change when the incoming and outgoing weights at any hidden unit are rescaled by reciprocal positive values. This paper makes two contributions to our understanding of these networks. The first is to describe a simple procedure, or {\it fix}, for balancing the weights in these networks: this procedure computes multiplicative rescaling factors---one at each hidden unit---that rebalance the weights of these networks without changing the end-to-end functions that they compute. Specifically, given an initial network with arbitrary weights, the procedure determines the functionally equivalent network whose weight matrix is of minimal $\ell_{p,q}$-norm; the weights at each hidden unit are said to be balanced when this norm is stationary with respect to rescaling transformations. The optimal rescaling factors are computed in an iterative fashion via simple multiplicative updates, and the updates are notable in that (a) they do not require the tuning of learning rates, (b) they operate in parallel on the rescaling factors at all hidden units, and (c) they converge monotonically to a global minimizer of the $\ell_{p,q}$-norm. The paper's second contribution is to analyze the optimization landscape for learning in these networks. We suppose that the network's loss function consists of two terms---one that is invariant to rescaling transformations, measuring predictive accuracy, and another (a regularizer) that breaks this invariance, penalizing large weights. We show how to derive a weight-balancing {\it flow} such that the regularizer remains minimal with respect to rescaling transformations as the weights descend in the loss function. These dynamics reduce to an ordinary gradient flow for $\ell_2$-norm regularization, but not otherwise. In this way our analysis suggests a canonical pairing of alternative flows and regularizers.
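The rescaling symmetry described in the abstract can be checked numerically. The following is a minimal sketch in plain Python and is not the paper's procedure. It assumes a single hidden layer of ReLU units without biases and the squared $\ell_2$-norm of the weights. In this special case the balancing factor at hidden unit $j$ has the closed form $c_j = \sqrt{b_j / a_j}$, where $a_j$ and $b_j$ are the norms of the unit's incoming and outgoing weights. The iterative multiplicative updates that the paper derives for general $\ell_{p,q}$-norms and deeper networks are not reproduced here. All function names (`forward`, `rescale`, `sq_norm`, `balance_l2`) are hypothetical and chosen only for illustration.

```python
import math
import random


def forward(W1, W2, x):
    """One-hidden-layer ReLU network without biases: y = W2 relu(W1 x)."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(w * hj for w, hj in zip(row, h)) for row in W2]


def rescale(W1, W2, c):
    """Scale the incoming weights of hidden unit j by c[j] > 0 and its
    outgoing weights by 1 / c[j]. ReLU is positively homogeneous, so the
    end-to-end function is unchanged."""
    W1s = [[c[j] * w for w in row] for j, row in enumerate(W1)]
    W2s = [[w / c[j] for j, w in enumerate(row)] for row in W2]
    return W1s, W2s


def sq_norm(W1, W2):
    """Sum of squared weights over both layers."""
    return sum(w * w for W in (W1, W2) for row in W for w in row)


def balance_l2(W1, W2):
    """Closed-form balancing for the assumed special case only.

    For unit j, minimizing c^2 * a^2 + b^2 / c^2 over c > 0 gives
    c = sqrt(b / a), after which the incoming and outgoing norms are equal.
    """
    c = []
    for j, row in enumerate(W1):
        a = math.sqrt(sum(w * w for w in row))            # incoming norm
        b = math.sqrt(sum(out[j] ** 2 for out in W2))     # outgoing norm
        # If either norm is zero the minimum is not attained; leave as-is.
        c.append(math.sqrt(b / a) if a > 0 and b > 0 else 1.0)
    return rescale(W1, W2, c)


if __name__ == "__main__":
    random.seed(0)
    W1 = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]
    W2 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]
    x = [0.5, -1.0, 2.0]
    B1, B2 = balance_l2(W1, W2)
    print("outputs before:", forward(W1, W2, x))
    print("outputs after: ", forward(B1, B2, x))
    print("squared norm before/after:", sq_norm(W1, W2), sq_norm(B1, B2))
```

Running the script prints the network outputs before and after balancing, which should agree up to floating-point error, along with the squared norm, which should not increase. With more than one hidden layer the rescaling factors at neighboring layers are coupled, so this per-unit closed form no longer applies directly.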

Cite

Text

Saul. "Weight-Balancing Fixes and Flows for Deep Learning." Transactions on Machine Learning Research, 2023.

Markdown

[Saul. "Weight-Balancing Fixes and Flows for Deep Learning." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/saul2023tmlr-weightbalancing/)

BibTeX

@article{saul2023tmlr-weightbalancing,
  title     = {{Weight-Balancing Fixes and Flows for Deep Learning}},
  author    = {Saul, Lawrence K.},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/saul2023tmlr-weightbalancing/}
}