Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation
Abstract
Although pre-trained language models such as BERT have achieved appealing performance on a wide range of Natural Language Processing (NLP) tasks, they are computationally expensive to deploy in real-time applications. A typical remedy is knowledge distillation, which compresses these large pre-trained models (teacher models) into small student models. However, for a target domain with scarce training data, the teacher can hardly pass useful knowledge to the student, which degrades the performance of the student model. To tackle this problem, we propose a method that learns to augment data for BERT knowledge distillation in target domains with scarce labeled data, by learning a cross-domain manipulation scheme that automatically augments the target domain with the help of resource-rich source domains. Specifically, the proposed method generates samples drawn from a stationary distribution near the target data and adopts a reinforced controller to automatically refine the augmentation strategy according to the performance of the student. Extensive experiments demonstrate that the proposed method significantly outperforms state-of-the-art baselines on different NLP tasks, and for data-scarce domains the compressed student models even perform better than the original large teacher model, with far fewer parameters (only ~13.3%), when only a few labeled examples are available.
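To make the high-level idea concrete, the sketch below is a rough, hypothetical illustration (not the authors' released code) of the two ingredients the abstract names: a temperature-softened distillation loss between teacher and student, and a REINFORCE-style controller whose reward is the student's development performance after training on data produced by a chosen augmentation operation. The operation names, reward simulation, and hyperparameters are assumptions made purely for illustration.

```python
# Illustrative sketch only: soft-label distillation loss + a reinforced
# controller over augmentation operations. All names and numbers are
# assumptions, not the authors' implementation.
import numpy as np

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    def softmax_T(x):
        e = np.exp((x - x.max(axis=-1, keepdims=True)) / T)
        return e / e.sum(axis=-1, keepdims=True)
    p_t, p_s = softmax_T(teacher_logits), softmax_T(student_logits)
    return float(np.mean(np.sum(p_t * (np.log(p_t + 1e-9) - np.log(p_s + 1e-9)), axis=-1)))

# Hypothetical augmentation operations the controller can choose among.
AUG_OPS = ["synonym_swap", "word_dropout", "source_domain_mixup"]

class AugmentationController:
    """Softmax policy over augmentation ops, updated with REINFORCE."""
    def __init__(self, n_ops, lr=0.1):
        self.theta = np.zeros(n_ops)
        self.lr = lr

    def policy(self):
        e = np.exp(self.theta - self.theta.max())
        return e / e.sum()

    def sample(self, rng):
        return rng.choice(len(self.theta), p=self.policy())

    def update(self, op, reward, baseline):
        # REINFORCE: raise the log-probability of ops that beat the baseline.
        grad = -self.policy()
        grad[op] += 1.0
        self.theta += self.lr * (reward - baseline) * grad

rng = np.random.default_rng(0)
controller = AugmentationController(len(AUG_OPS))
baseline = 0.0
for step in range(20):
    op = controller.sample(rng)
    # Placeholder reward: stands in for the student's dev accuracy after
    # distillation on data augmented with AUG_OPS[op]; simulated here.
    reward = rng.normal(loc=[0.70, 0.72, 0.78][op], scale=0.02)
    baseline = 0.9 * baseline + 0.1 * reward   # moving-average baseline
    controller.update(op, reward, baseline)

print("learned op preferences:", dict(zip(AUG_OPS, np.round(controller.policy(), 2))))
```

In this toy setup the controller's preferences drift toward whichever operation yields the highest simulated student reward; in the paper's setting the reward would instead come from evaluating the distilled student on held-out target-domain data.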
Cite

Text

Feng et al. "Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I8.16910

Markdown

[Feng et al. "Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/feng2021aaai-learning/) doi:10.1609/AAAI.V35I8.16910

BibTeX
@inproceedings{feng2021aaai-learning,
title = {{Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation}},
author = {Feng, Lingyun and Qiu, Minghui and Li, Yaliang and Zheng, Hai-Tao and Shen, Ying},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {7422-7430},
doi = {10.1609/AAAI.V35I8.16910},
url = {https://mlanthology.org/aaai/2021/feng2021aaai-learning/}
}