Med-EASi: Finely Annotated Dataset and Models for Controllable Simplification of Medical Texts

Abstract

Automatic medical text simplification can assist providers with patient-friendly communication and make medical texts more accessible, thereby improving health literacy. However, curating a high-quality corpus for this task requires the supervision of medical experts. In this work, we present Med-EASi (Medical dataset for Elaborative and Abstractive Simplification), a uniquely crowdsourced and finely annotated dataset for supervised simplification of short medical texts. Its expert-layman-AI collaborative annotations facilitate controllability over text simplification by marking four kinds of textual transformations: elaboration, replacement, deletion, and insertion. To learn medical text simplification, we fine-tune T5-large with four different styles of input-output combinations, leading to two control-free and two controllable versions of the model. Using a multi-angle training approach, we add two types of controllability to text simplification: position-aware, which uses in-place annotated inputs and outputs, and position-agnostic, where the model knows only the contents to be edited, but not their positions. Our results show that our fine-grained annotations improve learning compared to the unannotated baseline. Furthermore, our position-aware control enables the model to generate better simplifications than the position-agnostic version. The data and code are available at https://github.com/Chandrayee/CTRL-SIMP.
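
The difference between the two kinds of control can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Span` type, the `<kind>...</kind>` tag syntax, and the `| simplify:` prompt layout are hypothetical and are not taken from the Med-EASi data or the CTRL-SIMP code. The sketch only shows the idea that a position-aware input marks spans in place, while a position-agnostic input lists the contents to edit without saying where they occur.

```python
from dataclasses import dataclass

# The four transformation types named in the abstract.
EDIT_TYPES = ("elaboration", "replacement", "deletion", "insertion")


@dataclass
class Span:
    """Hypothetical annotation: a character range plus a transformation type."""
    start: int  # offset of the first character in the span
    end: int    # offset one past the last character in the span
    kind: str   # one of EDIT_TYPES


def _ordered(spans):
    """Validate span types and return spans sorted by start offset."""
    for s in spans:
        if s.kind not in EDIT_TYPES:
            raise ValueError(f"unknown edit type: {s.kind}")
    return sorted(spans, key=lambda s: s.start)


def position_aware(text, spans):
    """Wrap each (non-overlapping) span in place with hypothetical tags."""
    pieces, cursor = [], 0
    for s in _ordered(spans):
        pieces.append(text[cursor:s.start])
        pieces.append(f"<{s.kind}>{text[s.start:s.end]}</{s.kind}>")
        cursor = s.end
    pieces.append(text[cursor:])
    return "".join(pieces)


def position_agnostic(text, spans):
    """Leave the text unmarked and list the contents to edit as a prefix."""
    controls = "; ".join(
        f"{s.kind}: {text[s.start:s.end]}" for s in _ordered(spans)
    )
    return f"{controls} | simplify: {text}"


# Example with one annotated term (the sentence is made up for illustration).
sentence = "The patient has dyspnea at rest."
start = sentence.index("dyspnea")
example_spans = [Span(start, start + len("dyspnea"), "replacement")]
print(position_aware(sentence, example_spans))
print(position_agnostic(sentence, example_spans))
```

In the first format the model sees exactly where the term occurs; in the second it sees only that the term should be replaced somewhere in the sentence.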

Cite

Text

Basu et al. "Med-EASi: Finely Annotated Dataset and Models for Controllable Simplification of Medical Texts." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I12.26649

Markdown

[Basu et al. "Med-EASi: Finely Annotated Dataset and Models for Controllable Simplification of Medical Texts." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/basu2023aaai-med/) doi:10.1609/AAAI.V37I12.26649

BibTeX

@inproceedings{basu2023aaai-med,
  title     = {{Med-EASi: Finely Annotated Dataset and Models for Controllable Simplification of Medical Texts}},
  author    = {Basu, Chandrayee and Vasu, Rosni and Yasunaga, Michihiro and Yang, Qian},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {14093--14101},
  doi       = {10.1609/AAAI.V37I12.26649},
  url       = {https://mlanthology.org/aaai/2023/basu2023aaai-med/}
}