Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding

Abstract

Enhanced with machine learning, language understanding enables computers not only to comprehend but also to learn from human language, thereby augmenting the capabilities of various NLP applications in AI. Multi-teacher distillation is a prominent method for knowledge transfer in language understanding, leveraging multiple teacher models to train a single student model. However, this approach incurs significant time and storage costs for training and inference with multiple teachers. To address these issues, we introduce PEE-KD, a simple yet effective framework that generates supervision for training a student model from a single language model. We implement a language model with multiple prompts as the teacher model in multi-teacher distillation, enabling lightweight training and inference. Additionally, we propose an uncertainty-based method to enhance the robustness and accuracy of multiple prompts during training, along with a selector module to improve the inference speed of multi-teacher models. Experiments on NLU and NER tasks demonstrate that PEE-KD improves accuracy by up to 1.8% and efficiency by up to 140% compared to existing methods. Logit visualization comparisons between teacher and student models further validate the effectiveness of our approach. Our code and data are available at https://anonymous.4open.science/r/PEEKD-DF50/.
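The core idea in the abstract — treating one language model queried with multiple prompts as an ensemble of teachers, weighting each prompt by its uncertainty, and distilling the aggregate into a student — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: we assume each prompt yields a teacher probability distribution over labels, use inverse entropy as a simple uncertainty proxy for weighting, and distill the weighted mixture into the student via KL divergence. All function names are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy of a probability distribution (uncertainty proxy)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def aggregate_prompt_teachers(prompt_dists):
    """Mix per-prompt teacher distributions, weighting each prompt by
    inverse entropy: lower uncertainty -> higher weight."""
    weights = [1.0 / (entropy(p) + 1e-8) for p in prompt_dists]
    total = sum(weights)
    weights = [w / total for w in weights]
    num_classes = len(prompt_dists[0])
    return [sum(w * p[i] for w, p in zip(weights, prompt_dists))
            for i in range(num_classes)]

def kl_divergence(p, q):
    """KL(p || q): distillation loss between teacher mixture p and student q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Three hypothetical prompt-teacher distributions over 3 classes,
# plus a hypothetical student distribution.
teachers = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.4, 0.3, 0.3]]
student = [0.5, 0.3, 0.2]

target = aggregate_prompt_teachers(teachers)  # uncertainty-weighted soft target
loss = kl_divergence(target, student)         # scalar distillation loss
```

In this sketch, the more confident prompts (lower-entropy distributions) dominate the soft target, which loosely mirrors the abstract's uncertainty-based weighting; the paper's actual selector module and training procedure are more involved.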

Cite

Text

Meng et al. "Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024. doi:10.1007/978-3-031-70371-3_13

Markdown

[Meng et al. "Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024.](https://mlanthology.org/ecmlpkdd/2024/meng2024ecmlpkdd-harnessing/) doi:10.1007/978-3-031-70371-3_13

BibTeX

@inproceedings{meng2024ecmlpkdd-harnessing,
  title     = {{Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding}},
  author    = {Meng, Xv and Rao, Jun and Qi, Shuhan and Wang, Lei and Xiao, Jing and Wang, Xuan},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2024},
  pages     = {218--234},
  doi       = {10.1007/978-3-031-70371-3_13},
  url       = {https://mlanthology.org/ecmlpkdd/2024/meng2024ecmlpkdd-harnessing/}
}