Sequential Learning of Neural Networks for Prequential MDL
Abstract
Minimum Description Length (MDL) provides a framework and an objective for principled model evaluation. It formalizes Occam's Razor and can be applied to data from non-stationary sources. In the prequential formulation of MDL, the objective is to minimize the cumulative next-step log-loss when sequentially going through the data and using previous observations for parameter estimation. It thus closely resembles a continual- or online-learning problem. In this study, we evaluate approaches for computing prequential description lengths for image classification datasets with neural networks. Considering the computational cost, we find that online learning with rehearsal has favorable performance compared to the previously widely used block-wise estimation. We propose forward-calibration to better align the model's predictions with the empirical observations, and introduce replay-streams, a minibatch-incremental training technique that efficiently implements approximate random replay while avoiding large in-memory replay buffers. As a result, we present description lengths for a suite of image classification datasets that improve upon previously reported results by large margins.
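To make the prequential objective concrete: the description length of a labeled stream is the accumulated next-step log-loss, L = -sum_t log p(y_t | x_t, theta_{<t}), where the parameters theta_{<t} are estimated from the first t-1 observations only. Below is a minimal sketch of this evaluation loop with rehearsal, under illustrative assumptions: a toy logistic-regression learner on synthetic data, with placeholder learning rate and replay-batch size that are not the paper's actual setup. For simplicity it keeps the whole stream in memory for replay, which is precisely the cost the paper's replay-streams technique is designed to avoid.

```python
# Prequential (next-step log-loss) evaluation with rehearsal: a minimal
# sketch, not the paper's implementation. Model, hyperparameters, and the
# synthetic data stream are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification stream: 2000 examples, 10 features.
n, d = 2000, 10
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(d)          # model parameters, updated online
lr = 0.1                 # learning rate (illustrative)
replay_size = 32         # past examples replayed per step (illustrative)
description_length = 0.0 # cumulative next-step log-loss, in nats

for t in range(n):
    x_t, y_t = X[t], y[t]

    # 1) Predict BEFORE training on (x_t, y_t): the prequential rule.
    p = np.clip(sigmoid(x_t @ w), 1e-12, 1 - 1e-12)
    description_length += -(y_t * np.log(p) + (1 - y_t) * np.log(1 - p))

    # 2) Train on the new example plus a random replay batch of past data
    #    (rehearsal). A full in-memory buffer is used here for simplicity;
    #    replay-streams would avoid materializing it.
    if t > 0:
        idx = rng.integers(0, t, size=min(replay_size, t))
        Xb = np.vstack([X[idx], x_t[None, :]])
        yb = np.concatenate([y[idx], [y_t]])
    else:
        Xb, yb = x_t[None, :], np.array([y_t])
    grad = Xb.T @ (sigmoid(Xb @ w) - yb) / len(yb)
    w -= lr * grad

print(f"prequential description length: {description_length:.1f} nats")
```

Note that each example contributes to the description length exactly once, before the model ever trains on it; block-wise estimation instead freezes the model over whole segments of the stream, which is what the online-with-rehearsal approach improves upon at comparable compute.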
Cite
Text
Bornschein et al. "Sequential Learning of Neural Networks for Prequential MDL." International Conference on Learning Representations, 2023.

Markdown
[Bornschein et al. "Sequential Learning of Neural Networks for Prequential MDL." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/bornschein2023iclr-sequential/)

BibTeX
@inproceedings{bornschein2023iclr-sequential,
title = {{Sequential Learning of Neural Networks for Prequential MDL}},
author = {Bornschein, Jörg and Li, Yazhe and Hutter, Marcus},
booktitle = {International Conference on Learning Representations},
year = {2023},
url = {https://mlanthology.org/iclr/2023/bornschein2023iclr-sequential/}
}