Scalable and Order-Robust Continual Learning with Additive Parameter Decomposition

Abstract

While recent continual learning methods largely alleviate the catastrophic forgetting problem on toy-sized datasets, several issues remain before they can be applied to real-world problem domains. First, a continual learning model should effectively handle catastrophic forgetting and be efficient to train even with a large number of tasks. Second, it needs to address order-sensitivity, where per-task performance varies largely with the order of the task arrival sequence; this can cause serious problems in applications where fairness plays a critical role (e.g. medical diagnosis). To tackle these practical challenges, we propose a novel continual learning method that is scalable as well as order-robust, which, instead of learning a completely shared set of weights, represents the parameters for each task as a sum of task-shared and sparse task-adaptive parameters. With our Additive Parameter Decomposition (APD), the task-adaptive parameters for earlier tasks remain mostly unaffected, as we update them only to reflect changes made to the task-shared parameters. This decomposition of parameters effectively prevents catastrophic forgetting and order-sensitivity, while being computation- and memory-efficient. Further, we can achieve even better scalability with APD using hierarchical knowledge consolidation, which clusters the task-adaptive parameters to obtain hierarchically shared parameters. We validate our network with APD, APD-Net, on multiple benchmark datasets against state-of-the-art continual learning methods, which it largely outperforms in accuracy, scalability, and order-robustness.
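The core idea of the decomposition can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes effective task parameters of the form `shared + adaptive[t]`, where each `adaptive[t]` is a sparse per-task addition, and shows how updating the shared weights for a new task leaves earlier task-adaptive parameters untouched. All names (`APDLayer`, `add_task`, `weights`) are illustrative.

```python
import numpy as np

class APDLayer:
    """Toy sketch of Additive Parameter Decomposition (APD).

    Effective parameters for task t: theta_t = shared + adaptive[t],
    where `shared` is task-shared and `adaptive[t]` is sparse.
    Hypothetical interface; not taken from the paper's code.
    """

    def __init__(self, shape, sparsity=0.75, seed=0):
        self.rng = np.random.default_rng(seed)
        self.shape = shape
        self.sparsity = sparsity
        self.shared = self.rng.standard_normal(shape) * 0.1  # task-shared
        self.adaptive = {}  # per-task sparse task-adaptive parameters

    def add_task(self, task_id):
        # Sparse task-adaptive parameters: most entries remain zero.
        mask = self.rng.random(self.shape) > self.sparsity
        self.adaptive[task_id] = mask * self.rng.standard_normal(self.shape) * 0.1

    def weights(self, task_id):
        # Additive decomposition: shared plus sparse task-adaptive part.
        return self.shared + self.adaptive[task_id]

layer = APDLayer((4, 4))
layer.add_task(0)
layer.add_task(1)

# Updating the shared parameters while "training" on task 1 leaves
# task 0's adaptive parameters untouched, which is what limits
# forgetting and order-sensitivity in this decomposition.
old_tau0 = layer.adaptive[0].copy()
layer.shared += 0.01  # stand-in for a gradient step on a new task
assert np.allclose(layer.adaptive[0], old_tau0)
```

In the actual method, the task-adaptive parts are additionally updated only to compensate for drift in the shared parameters, and hierarchical knowledge consolidation clusters them into hierarchically shared components; this sketch omits both.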

Cite

Text

Yoon et al. "Scalable and Order-Robust Continual Learning with Additive Parameter Decomposition." International Conference on Learning Representations, 2020.

Markdown

[Yoon et al. "Scalable and Order-Robust Continual Learning with Additive Parameter Decomposition." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/yoon2020iclr-scalable/)

BibTeX

@inproceedings{yoon2020iclr-scalable,
  title     = {{Scalable and Order-Robust Continual Learning with Additive Parameter Decomposition}},
  author    = {Yoon, Jaehong and Kim, Saehoon and Yang, Eunho and Hwang, Sung Ju},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/yoon2020iclr-scalable/}
}