Students Need More Attention: BERT-Based Attention Model for Small Data with Application to Automatic Patient Message Triage

Abstract

Small and imbalanced datasets commonly seen in healthcare represent a challenge when training classifiers based on deep learning models. So motivated, we propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining). Specifically, (i) we introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) by distilling LESA-BERT to smaller variants, we aim to reduce overfitting and model size when working on small datasets. As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium, and urgent. Experiments demonstrate that our approach can outperform several strong baseline classifiers by a significant margin of 4.3% in terms of macro F1 score. The code for this project is publicly available at https://github.com/shijing001/text_classifiers
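The abstract names two components: label embeddings injected into self-attention, and distillation of the resulting model into smaller students. Below is a minimal PyTorch sketch of the general idea, assuming label embeddings are prepended to the token sequence so attention mixes label and token states, plus a standard soft-target distillation loss; the class, function, and parameter names here are hypothetical illustrations, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelEmbeddingSelfAttention(nn.Module):
    """Hypothetical sketch of the LESA idea: prepend one learnable
    embedding per class to the token sequence so that self-attention
    can attend jointly over labels and tokens."""

    def __init__(self, hidden_size: int, num_labels: int, num_heads: int = 8):
        super().__init__()
        # One learnable embedding per class (e.g., non-urgent, medium, urgent).
        self.label_embeddings = nn.Parameter(torch.randn(num_labels, hidden_size))
        self.attention = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_size)
        batch_size = token_states.size(0)
        labels = self.label_embeddings.unsqueeze(0).expand(batch_size, -1, -1)
        # Joint sequence of label and token states; attention mixes both.
        joint = torch.cat([labels, token_states], dim=1)
        attended, _ = self.attention(joint, joint, joint)
        # Drop the label positions so downstream layers see only token states.
        return attended[:, labels.size(1):, :]


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Soft-target KL loss commonly used to distill a teacher into a
    smaller student; the temperature value is an assumption."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2
```

Shrinking the student both cuts model size and, on small datasets like triage messages, acts as a regularizer against the overfitting the abstract highlights.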

Cite

Text

Si et al. "Students Need More Attention: BERT-Based Attention Model for Small Data with Application to Automatic Patient Message Triage." Proceedings of the 5th Machine Learning for Healthcare Conference, 2020.

Markdown

[Si et al. "Students Need More Attention: BERT-Based Attention Model for Small Data with Application to Automatic Patient Message Triage." Proceedings of the 5th Machine Learning for Healthcare Conference, 2020.](https://mlanthology.org/mlhc/2020/si2020mlhc-students/)

BibTeX

@inproceedings{si2020mlhc-students,
  title     = {{Students Need More Attention: BERT-Based Attention Model for Small Data with Application to Automatic Patient Message Triage}},
  author    = {Si, Shijing and Wang, Rui and Wosik, Jedrek and Zhang, Hao and Dov, David and Wang, Guoyin and Carin, Lawrence},
  booktitle = {Proceedings of the 5th Machine Learning for Healthcare Conference},
  year      = {2020},
  pages     = {436--456},
  volume    = {126},
  url       = {https://mlanthology.org/mlhc/2020/si2020mlhc-students/}
}