Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP
Abstract
Catastrophic forgetting poses a significant challenge in continual learning (CL). In the context of Natural Language Processing, generative-based rehearsal CL methods have made progress in avoiding expensive retraining. However, generating pseudo samples that accurately capture the task-specific distribution remains a daunting task. In this paper, we propose Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy for CL. Unlike the conventional use of a Gaussian latent variable in the conditional variational autoencoder (CVAE), DCL exploits the flexibility of the Dirichlet distribution to model the latent variable. This allows DCL to effectively capture sentence-level features of previous tasks and guide the generation of pseudo samples. Additionally, we introduce Jensen-Shannon Knowledge Distillation, a robust logit-based knowledge distillation method that enhances knowledge transfer during pseudo-sample generation. Our extensive experiments show that DCL outperforms state-of-the-art methods on two typical tasks of task-oriented dialogue systems, demonstrating its efficacy.
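The abstract only sketches the two components, so the following is a minimal illustrative sketch (not the authors' released implementation) of a logit-based Jensen-Shannon distillation loss in PyTorch; the function name js_distillation_loss and the temperature tau are assumptions made for illustration.

import torch
import torch.nn.functional as F

def js_distillation_loss(student_logits: torch.Tensor,
                         teacher_logits: torch.Tensor,
                         tau: float = 2.0) -> torch.Tensor:
    # Jensen-Shannon divergence between temperature-softened
    # student and teacher class distributions.
    p = F.softmax(student_logits / tau, dim=-1)   # student distribution
    q = F.softmax(teacher_logits / tau, dim=-1)   # teacher distribution
    m = 0.5 * (p + q)                             # mixture distribution
    # F.kl_div(log_input, target) computes KL(target || input),
    # so these terms are KL(p || m) and KL(q || m), averaged over the batch.
    kl_pm = F.kl_div(m.log(), p, reduction="batchmean")
    kl_qm = F.kl_div(m.log(), q, reduction="batchmean")
    return 0.5 * (kl_pm + kl_qm)

if __name__ == "__main__":
    student = torch.randn(4, 10)  # batch of 4 examples, 10 classes
    teacher = torch.randn(4, 10)
    print(js_distillation_loss(student, teacher))

Unlike the asymmetric KL term used in standard logit distillation, the JS divergence is symmetric and bounded, which is the robustness property the abstract alludes to.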
Cite
Text
Zeng et al. "Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP." Uncertainty in Artificial Intelligence, 2024.

Markdown

[Zeng et al. "Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP." Uncertainty in Artificial Intelligence, 2024.](https://mlanthology.org/uai/2024/zeng2024uai-dirichlet/)

BibTeX
@inproceedings{zeng2024uai-dirichlet,
  title     = {{Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP}},
  author    = {Zeng, Min and Yang, Haiqin and Xue, Wei and Liu, Qifeng and Guo, Yike},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2024},
  pages     = {4096--4108},
  volume    = {244},
  url       = {https://mlanthology.org/uai/2024/zeng2024uai-dirichlet/}
}