Multiple Modes for Continual Learning
Abstract
Adapting model parameters to incoming streams of data is a crucial factor in the scalability of deep learning. Interestingly, prior continual learning strategies in online settings inadvertently anchor their updated parameters to a local parameter subspace in order to remember old tasks, or else drift away from that subspace and forget. From this observation, we formulate a trade-off between constructing multiple parameter modes and allocating tasks per mode. Mode-Optimized Task Allocation (MOTA), our contributed adaptation strategy, trains multiple modes in parallel and then optimizes task allocation per mode. We empirically demonstrate improvements over baseline continual learning strategies and across varying distribution shifts, namely sub-population, domain, and task shift.
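To make the multiple-modes idea concrete, below is a minimal sketch (not the authors' MOTA implementation) of maintaining several parameter "modes" as independent model copies and allocating each incoming task to one mode. The toy data, the greedy lowest-loss allocation rule, and all hyperparameters are illustrative assumptions; MOTA's actual allocation optimization differs.

import torch
import torch.nn as nn


def make_mode() -> nn.Module:
    # One "mode" = one independent set of parameters.
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))


def task_loss(mode: nn.Module, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return nn.functional.cross_entropy(mode(x), y)


def allocate_and_train(modes, task_stream, steps_per_task=50, lr=1e-2):
    """Greedy allocation (an assumption for this sketch): send each task to the
    mode with the lowest current loss, then update only that mode, so the other
    modes stay anchored to the parameter regions that remember their old tasks."""
    allocation = []
    for x, y in task_stream:
        with torch.no_grad():
            losses = [task_loss(m, x, y).item() for m in modes]
        k = min(range(len(modes)), key=lambda i: losses[i])
        allocation.append(k)
        opt = torch.optim.SGD(modes[k].parameters(), lr=lr)
        for _ in range(steps_per_task):
            opt.zero_grad()
            task_loss(modes[k], x, y).backward()
            opt.step()
    return allocation


if __name__ == "__main__":
    torch.manual_seed(0)
    modes = [make_mode() for _ in range(3)]  # three parameter modes trained side by side
    # Toy task stream: each task is a labelled batch with its own input shift.
    stream = [(torch.randn(64, 10) + t, torch.randint(0, 2, (64,))) for t in range(5)]
    print("task -> mode:", allocate_and_train(modes, stream))

The greedy rule here simply illustrates the trade-off the abstract describes: with more modes, each mode absorbs fewer tasks and drifts less, at the cost of maintaining more parameter copies.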
Cite
@inproceedings{datta2022neuripsw-multiple-a,
title = {{Multiple Modes for Continual Learning}},
author = {Datta, Siddhartha and Shadbolt, Nigel},
booktitle = {NeurIPS 2022 Workshops: MetaLearn},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/datta2022neuripsw-multiple-a/}
}