Learning Deep ResNet Blocks Sequentially Using Boosting Theory

Abstract

We prove a multi-channel telescoping sum boosting theory for ResNet architectures that simultaneously yields a new technique for boosting over features (in contrast with labels) and a new training algorithm for ResNet-style architectures. Our proposed training algorithm, BoostResNet, is particularly suitable for non-differentiable architectures. The method requires only the relatively inexpensive sequential training of $T$ “shallow ResNets”. We prove that the training error decays exponentially with the depth $T$ if the weak module classifiers that we train perform slightly better than some weak baseline. In other words, we propose a weak learning condition and prove a boosting theory for ResNet under that condition. We also prove a generalization error bound based on margin theory, which suggests that a ResNet with $\ell_1$-norm-bounded weights could be resistant to overfitting.
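
The "telescoping sum" idea mentioned in the abstract can be illustrated numerically. The sketch below is not the BoostResNet algorithm itself. It assumes each depth $t$ produces a scalar score `scores[t]` with a combination weight `alphas[t]`, and that a weak module classifier is the difference of consecutive weighted scores. These names and this exact form are illustrative assumptions, not taken from the abstract. Under those assumptions, the sum of the $T$ weak module outputs collapses to the weighted score of the deepest block.

```python
import random


def weak_module_outputs(scores, alphas):
    """Return h_t = alpha_{t+1} * o_{t+1} - alpha_t * o_t for t = 0..T-1.

    Assumption (hypothetical notation): the score before the first block
    is defined as 0, so the first term is just alpha_1 * o_1.
    """
    padded_scores = [0.0] + list(scores)
    padded_alphas = [0.0] + list(alphas)
    return [
        padded_alphas[t + 1] * padded_scores[t + 1]
        - padded_alphas[t] * padded_scores[t]
        for t in range(len(scores))
    ]


# Toy per-depth scores and weights for a single example (made-up values).
random.seed(0)
T = 5
scores = [random.uniform(-1.0, 1.0) for _ in range(T)]
alphas = [random.uniform(0.1, 1.0) for _ in range(T)]

h = weak_module_outputs(scores, alphas)

# Intermediate terms cancel pairwise, so the ensemble of weak module
# classifiers equals the weighted output of the last block alone.
ensemble_score = sum(h)
final_block_score = alphas[-1] * scores[-1]
print(abs(ensemble_score - final_block_score) < 1e-12)
```

Because the intermediate terms cancel, a boosting-style argument about the ensemble of weak modules becomes a statement about the output of the full-depth network, which is what allows blocks to be trained one at a time.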

Cite

Text

Huang et al. "Learning Deep ResNet Blocks Sequentially Using Boosting Theory." International Conference on Machine Learning, 2018.

Markdown

[Huang et al. "Learning Deep ResNet Blocks Sequentially Using Boosting Theory." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/huang2018icml-learning-a/)

BibTeX

@inproceedings{huang2018icml-learning-a,
  title     = {{Learning Deep ResNet Blocks Sequentially Using Boosting Theory}},
  author    = {Huang, Furong and Ash, Jordan and Langford, John and Schapire, Robert},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {2058--2067},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/huang2018icml-learning-a/}
}