Rethinking Momentum Knowledge Distillation in Online Continual Learning

Abstract

Online Continual Learning (OCL) addresses the problem of training neural networks on a continuous data stream where multiple classification tasks emerge in sequence. In contrast to offline Continual Learning, each data point can be seen only once in OCL, which is a severe constraint. In this context, replay-based strategies have achieved impressive results, and most state-of-the-art approaches depend heavily on them. While Knowledge Distillation (KD) has been extensively used in offline Continual Learning, it remains under-exploited in OCL despite its high potential. In this paper, we analyze the challenges of applying KD to OCL and provide empirical justification for them. We introduce a direct yet effective methodology for applying Momentum Knowledge Distillation (MKD) to many flagship OCL methods and demonstrate its ability to enhance existing approaches. In addition to improving state-of-the-art accuracy by more than 10 percentage points on ImageNet100, we shed light on MKD's internal mechanics and its impact during training in OCL. We argue that, similar to replay, MKD should be considered a central component of OCL. The code is available at https://github.com/Nicolas1203/mkd_ocl.
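The sketch below illustrates the general idea behind momentum knowledge distillation in a replay-based online training loop: the student is distilled against an exponential-moving-average (EMA) copy of itself while also learning from the incoming stream and a replay buffer. This is only a minimal illustration under assumed choices, not the authors' exact method (see the linked repository for that); the `momentum`, `temperature`, and `kd_weight` values and the `buffer.sample`/`buffer.update` interface are hypothetical.

```python
# Minimal sketch of an EMA-teacher (momentum) distillation step in replay-based OCL.
# NOT the authors' implementation; hyperparameters and the buffer API are assumptions.
import torch
import torch.nn.functional as F


def ema_update(teacher, student, momentum=0.999):
    """Update the momentum (EMA) teacher towards the student weights."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)


def mkd_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation loss between student and EMA-teacher outputs."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2


def train_step(student, teacher, optimizer, stream_x, stream_y, buffer, kd_weight=1.0):
    """One online step: replay + cross-entropy, plus distillation against the EMA teacher."""
    mem_x, mem_y = buffer.sample(len(stream_x))   # hypothetical replay-buffer API
    x = torch.cat([stream_x, mem_x])
    y = torch.cat([stream_y, mem_y])

    student_logits = student(x)
    with torch.no_grad():
        teacher_logits = teacher(x)               # teacher is never trained by backprop

    loss = F.cross_entropy(student_logits, y) + kd_weight * mkd_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    ema_update(teacher, student)                  # teacher lags behind the student
    buffer.update(stream_x, stream_y)             # store incoming stream samples
    return loss.item()
```

In this kind of setup the teacher is typically initialized as a frozen deep copy of the student (e.g. `copy.deepcopy(student)` with gradients disabled), so the distillation target slowly tracks past versions of the model rather than a fixed pre-trained network.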

Cite

Text

Michel et al. "Rethinking Momentum Knowledge Distillation in Online Continual Learning." International Conference on Machine Learning, 2024.

Markdown

[Michel et al. "Rethinking Momentum Knowledge Distillation in Online Continual Learning." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/michel2024icml-rethinking/)

BibTeX

@inproceedings{michel2024icml-rethinking,
  title     = {{Rethinking Momentum Knowledge Distillation in Online Continual Learning}},
  author    = {Michel, Nicolas and Wang, Maorong and Xiao, Ling and Yamasaki, Toshihiko},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {35607--35622},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/michel2024icml-rethinking/}
}