Reparameterization Through Spatial Gradient Scaling
Abstract
Reparameterization aims to improve the generalization of deep neural networks by transforming a convolution operation into equivalent multi-branched structures during training. However, how reparameterization changes and benefits the learning process of a neural network is not well understood. In this paper, we present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional neural networks. We prove that spatial gradient scaling achieves the same learning dynamics as a branched reparameterization, yet without introducing structural changes into the network. We further propose an analytical approach that dynamically learns scalings for each convolutional layer based on the spatial characteristics of its input feature map, gauged by mutual information. Experiments on CIFAR-10, CIFAR-100, and ImageNet show that, without searching for reparameterized structures, our scaling method outperforms state-of-the-art reparameterization methods at a lower computational cost.
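For intuition, below is a minimal PyTorch sketch of the core mechanism described in the abstract: the gradient of a 3×3 convolution kernel is rescaled elementwise by a spatial mask during the backward pass, while the forward computation is left untouched. The mask `S` and the layer shapes are arbitrary placeholders chosen for illustration; they are not the paper's learned scalings, which are derived from the mutual information of each layer's input feature map.

```python
import torch
import torch.nn as nn

# A 3x3 convolution whose weight gradients we will rescale spatially.
conv = nn.Conv2d(16, 16, kernel_size=3, padding=1, bias=False)

# Hypothetical spatial scaling mask (one value per kernel position).
# In the paper these scalings are learned per layer; the numbers here
# are placeholders that only demonstrate the mechanism.
S = torch.tensor([[0.5, 1.0, 0.5],
                  [1.0, 2.0, 1.0],
                  [0.5, 1.0, 0.5]])

# Rescale the weight gradient elementwise in the backward pass;
# S broadcasts over the (out_channels, in_channels, 3, 3) gradient.
conv.weight.register_hook(lambda grad: grad * S)

# Forward/backward pass: the forward computation is unchanged, so the
# trained network remains a single plain convolution.
x = torch.randn(8, 16, 32, 32)
conv(x).sum().backward()
```

Because only the gradient is scaled, the forward pass and the inference-time structure are identical to those of an ordinary convolution, which is the sense in which the method avoids the structural changes introduced by branched reparameterization.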
Cite
Text
Detkov et al. "Reparameterization Through Spatial Gradient Scaling." International Conference on Learning Representations, 2023.

Markdown

[Detkov et al. "Reparameterization Through Spatial Gradient Scaling." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/detkov2023iclr-reparameterization/)

BibTeX
@inproceedings{detkov2023iclr-reparameterization,
  title     = {{Reparameterization Through Spatial Gradient Scaling}},
  author    = {Detkov, Alexander and Salameh, Mohammad and Fetrat, Muhammad and Zhang, Jialin and Luwei, Robin and Jui, Shangling and Niu, Di},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/detkov2023iclr-reparameterization/}
}