Casper : Cascading Hypernetworks for Scalable Continual Learning
Abstract
Continual learning, the ability of a model to learn tasks sequentially without forgetting, remains a formidable challenge in deep learning. This paper introduces a novel approach, termed cascading hypernetworks, that harnesses the ability of hypernetworks to generate the weights of multiple neural networks. To address the limited scalability of previous continual learning algorithms and accommodate an exponentially growing number of tasks, we propose a cascading architecture in which hypernetworks learn the weights of other hypernetworks. Additionally, through auto-generative replay, the hypernetwork generates samples of previously learned networks, mitigating forgetting without the need for an expanding memory buffer. We evaluate the effectiveness of cascading hypernetworks on both reinforcement learning tasks and image classification benchmarks, and our findings highlight their promise in addressing the scalability and forgetting challenges inherent in continual learning.
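To make the cascading idea concrete, below is a minimal sketch of a hypernetwork whose output parameterizes a second, smaller hypernetwork, which in turn emits the weights of a target layer. All class names, layer sizes, and the task-embedding interface are illustrative assumptions for this sketch, not the authors' implementation.

```python
# Illustrative sketch (assumed names and sizes), not the paper's code.
import torch
import torch.nn as nn


class HyperNetwork(nn.Module):
    """Maps a task embedding to a flat parameter vector for a target module."""

    def __init__(self, embedding_dim: int, target_param_count: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embedding_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, target_param_count),
        )

    def forward(self, task_embedding: torch.Tensor) -> torch.Tensor:
        return self.net(task_embedding)


# Target network: a single linear layer whose weights are generated rather than trained directly.
in_dim, out_dim, emb_dim = 16, 4, 8
target_param_count = in_dim * out_dim + out_dim  # weight matrix + bias

# Cascade: a "root" hypernetwork generates the parameters of a smaller "leaf"
# hypernetwork, which then generates the target layer's weights.
leaf = HyperNetwork(emb_dim, target_param_count, hidden_dim=32)
leaf_param_count = sum(p.numel() for p in leaf.parameters())
root = HyperNetwork(emb_dim, leaf_param_count, hidden_dim=64)

task_embedding = torch.randn(emb_dim)

# Root hypernetwork emits the leaf hypernetwork's parameters for this task.
# (Copying into .data breaks the gradient path; a real implementation would
# apply the generated weights functionally to keep the root trainable.)
leaf_flat = root(task_embedding)
offset = 0
for p in leaf.parameters():
    n = p.numel()
    p.data.copy_(leaf_flat[offset:offset + n].view_as(p))
    offset += n

# Leaf hypernetwork emits the target layer's weights, applied functionally.
target_flat = leaf(task_embedding)
W = target_flat[: in_dim * out_dim].view(out_dim, in_dim)
b = target_flat[in_dim * out_dim:]
x = torch.randn(5, in_dim)
y = torch.nn.functional.linear(x, W, b)
print(y.shape)  # torch.Size([5, 4])
```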
Cite
Pandit and Kudithipudi. "Casper : Cascading Hypernetworks for Scalable Continual Learning." NeurIPS 2024 Workshops: Continual_FoMo, 2024.
BibTeX
@inproceedings{pandit2024neuripsw-casper,
title = {{Casper : Cascading Hypernetworks for Scalable Continual Learning}},
author = {Pandit, Tej and Kudithipudi, Dhireesha},
booktitle = {NeurIPS 2024 Workshops: Continual_FoMo},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/pandit2024neuripsw-casper/}
}