LRC-BERT: Latent-Representation Contrastive Knowledge Distillation for Natural Language Understanding
Abstract
Pre-trained models such as BERT have achieved strong results on a wide range of natural language processing tasks. However, their large number of parameters requires significant memory and inference time, which makes them difficult to deploy on edge devices. In this work, we propose LRC-BERT, a knowledge distillation method based on contrastive learning that fits the output of the intermediate layers from the angular-distance perspective, an aspect not considered by existing distillation methods. Furthermore, we introduce a gradient perturbation-based training architecture in the training phase to increase the robustness of LRC-BERT, which is the first such attempt in knowledge distillation. Additionally, to better capture the distribution characteristics of the intermediate layers, we design a two-stage training method for the total distillation loss. Finally, on 8 datasets from the General Language Understanding Evaluation (GLUE) benchmark, LRC-BERT outperforms existing state-of-the-art methods, demonstrating the effectiveness of our approach.
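As a rough illustration of the angular-distance idea described in the abstract, the sketch below is a minimal, hypothetical PyTorch example (not the paper's actual implementation): it contrasts a student's intermediate representation against its matching teacher representation via cosine similarity, treating the other samples in the batch as negatives. The exact loss formulation, layer mapping, and projection used by LRC-BERT are defined in the paper; the function name, temperature value, and the assumption that student outputs are already projected to the teacher's hidden size are illustrative choices here.

```python
import torch
import torch.nn.functional as F

def angular_contrastive_loss(student_hidden: torch.Tensor,
                             teacher_hidden: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """Hypothetical sketch of an angular (cosine) contrastive distillation loss.

    Assumes student_hidden has already been projected to the teacher's hidden
    size, so both tensors share the shape (batch, ...) with matching feature dims.
    """
    # Flatten per-sample features and normalize so dot products equal cosines
    s = F.normalize(student_hidden.flatten(1), dim=-1)  # (batch, dim)
    t = F.normalize(teacher_hidden.flatten(1), dim=-1)  # (batch, dim)

    # Pairwise cosine similarities; the diagonal holds the positive pairs
    logits = s @ t.T / temperature                       # (batch, batch)
    targets = torch.arange(s.size(0), device=s.device)

    # InfoNCE-style cross-entropy: pull each student representation toward its
    # own teacher representation in angle, push it away from other samples'
    return F.cross_entropy(logits, targets)
```

In this sketch the in-batch negatives make the student match the teacher's direction rather than its magnitude, which is the intuition behind fitting intermediate layers by angular distance.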
Cite
Text
Fu et al. "LRC-BERT: Latent-Representation Contrastive Knowledge Distillation for Natural Language Understanding." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I14.17518
Markdown
[Fu et al. "LRC-BERT: Latent-Representation Contrastive Knowledge Distillation for Natural Language Understanding." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/fu2021aaai-lrc/) doi:10.1609/AAAI.V35I14.17518
BibTeX
@inproceedings{fu2021aaai-lrc,
title = {{LRC-BERT: Latent-Representation Contrastive Knowledge Distillation for Natural Language Understanding}},
author = {Fu, Hao and Zhou, Shaojun and Yang, Qihong and Tang, Junjie and Liu, Guiquan and Liu, Kaikui and Li, Xiaolong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {12830--12838},
doi = {10.1609/AAAI.V35I14.17518},
url = {https://mlanthology.org/aaai/2021/fu2021aaai-lrc/}
}